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ABSTRACT: Computational theories of structure from motion [Ullman, 1979] and stereo vision 
[Marr and Reggio, 1979] only specify the computation of three-dimensional surface information at 
special points in the image. Yet, the visual perception is clearly of complete surfaces. In order to 
account for this, a computational theory of the interpolation of surfaces from visual information is 

presented. 

The problem is constrained by the fact that die surface must agree with the information from 
stereo or motion correspondence, and not vary radically between these points. Using the image 
inadiance equation [Horn, 1977], an explicit form of this surface consistency constraint cm be derived 

[Grimson. 1981c]. 

To determine which of two possible surfaces is more consistent with die surface consistency 
constraint, one must be able to compare the two surfaces. To do this, a functional from the space 
of functions to the real numbers is required. In this way, the surface most consistent with the visual 
information will be that which minimizes the functional. To ensure that die functional has a unique 
minimal surface, conditions on the form of the functional are derived. In particular, if die functional 
is a complete semi-norm which satisfies the parallelogram law, or the space of functions is a semi- 
Hilbert space and the functional is a semi-inner product, dicn tiierc is a unique (to within an clement 
of die null space of the functional) surface which is most consistent with die visual information. 

It can be shown, based on the above conditions plus a condition of rotational sytnmetiy, diat 
there is a vector space of possible functionals which measure suiface consistency, this vector space 
being spanned by the functional of quadratic variation and the functional of square Laplacian [Brady 
and Horn, 1981]. Arguments based on the null spaces of the respective functionals are used to justify 

the choice of the quadratic variation as the optimal functional. 

Algorithms for computing die surface which minimizes quadratic variation in die case of exact 
surface interpolation and in die case of surface approximation are outlined and illustrated on a series 
of synthetic and actual surface interpolation examples. 
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1. Introduction 


Although our world has three spatial dimensions, the projection of light rays onto the retina 
presents our visual system with an image of the world that is inherently two-dimensional. We must 
use such images to physically interact with this three-dimensional world, even in situations new to 
us, or with objects unknown to us. That we do so easily implies that one of the functions of the 
human visual system is to reconstruct a three-dimensional representation of the world from its two- 
dimensional projection onto our eyes. 

Methods that could be used to effect this three-dimensional reconstruction include stereo vision 
[Marr and Poggio, 1979; Grimson, 1980,1981a; Mayhew and Frisby, 1981] and structure from motion 
[Uliman, 1979a], Both of these methods may be considered as correspondence techniques, since 
they rely on establishing a correspondence between identical items in different images, and using the 
difference in projection of these items to determine surface shape. That is, correspondence methods 


compute smlace information by: 

(1) Identifying a location in the physical scene in one image; 

(2) Identifying the corresponding location in a second image; either a second image taken from a 
different viewpoint in space (stereo) or a second image taken from a different viewpoint in time 
(structure from motion); and 

(3) Computing a three-dimensional surface value, representing the distance of the point relative to 
some base point, based on the difference in the positions of the two corresponding points in the 
images. 

Current computational theories of these processes [Marr and Poggio, 1979; Mayhew and Frisby, 
1981; Ullrnan, 1979a] argue that the correspondence process cannot take place at all points in an 
image. Rather, the first stage of the correspondence process is to derive a symbolic description of 
points in the image at which the irradiance undergoes a significant change [Marr and Hildreth, 1980]. 
This symbolic representation (called the primal sketch [Marr, 1976; Marr and Hildreth, 1980]) forms 
the input to the second stage of the process in which the actual correspondence is computed. As a 
consequence of the form of the input, the correspondence process can compute explicit surface infor¬ 
mation only at scattered points in the image. Vet our perception is clearly of complete surfaces. (For 
example, in Figure 1, a sparse random dot stereogram yields the vivid perception of a square floating 
in space above a background plane, rather than a collection of dots suspended in space.) The problem 
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Figure 1. A Sparse Random Dot Pattern. Although the density of clots is very small, the perception 
obtained upon fusing this pattern is one of two disjoint planes, rather than dots isolated in depth. 


to be addressed in this paper is that of computing complete surface representations, by interpolating 
an initial representation consisting of sparse surface values. 

We will examine this surface interpolation problem at two levels. The first level is to consider 
the strictly madicmatical question of surface reconstruction, independent of its relevance to the 
human visual system. Suppose one is given a visual process which determines surface information 
at points corresponding to relevant changes in the images. In general, there will be many possible 
surfaces consistent with these initial surface points. For example, consider the boundary conditions 
provided by a circular arc, along which the depth is constant. I he possible surface consistent with 
these know'll points include a flat disc, a sphere and even the highly convoluted surface formed by a 
radial sine function (see Figure 3). How do we distinguish the correct one? Mathematically, we need 
to be able to compare two possible surfaces, in order to determine which is “better”. This can be done 
by defining a functional 0 from the space of surfaces to the real numbers, so that comparing surfaces 
can be accomplished by comparing corresponding real numbers. Provided 0(/) < &(g) whenever 
surface / is “better” than surface g, the “best” surface to fit through the known points is that which 
minimizes 0. There are two problems to solve Piece: (1) What does it mean for / to lie “better” than 
p'? and (1) Under what conditions does a unique “best" surface exist? 
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Once these questions have been answered and an appropriate functional has been derived, we 
can turn to the second level, which is to consider a specific algorithm for finding the surface that 
optimizes the functional. Because our intent is to consider models for the interpolation process as it 
occurs in the human visual system, we will restrict our attention to biologically feasible algorithms 
[Ullman, 1979b; Grimson, 1981b]. 

The motivation for considering the interpolation problem first mathematically, independent of 
the specifics of the human system, and then algorithmically, incorporating specific biological con¬ 
straints, is based on the assumption that one can consider the visual system as a symbol manipulation 
process [Marr 1976,1981; Marr and Poggio, 1977]. This implies that the meaning of the symbols being 
manipulated can be distinguished from the physical embodiment of those symbols. Hence, one can 
deal with the mathematical consideration of the information processing which is occurring, independ¬ 
ent of the implementation of that processing (whether in transistors or neurons). The rationale for 
this view lies in the belief that any computational theory should address the fundamental questions 
of the information processing necessary to perform the task, and that such computational theories 


are independent, to a large extent, of the method used to compute them. The initial goal is thus to 
determine computational constraints on the interpolation problem, based on the input and output 


representations of the process, and based on the structure of the computation required to transform 
one representation into die other. Note that a computational theory of the information process¬ 
ing is applicable both to the human visual system, and to applications areas (such as high-altitude 
photomapping, hand-eye coordination systems, industrial robotics, and inspection of manufactured 
parts) where it is useful to create a complete specification of surface shape. 

While we shall initially concentrate on the mathematical aspects of visual surface interpolation, 
the problem is not completely isolated from the human visual system. If we view the human early 
visual system as a symbolic manipulator, we can consider visual processing as a scries of transfor¬ 
mations from one representation to another [Marr, 1976, 1981], In particular, three stages can be 
identified (see Figure 2). Front the images, one transforms to a description, called the primal sketch, 
of those locations at which the image irradiaitccs change. Next, primal sketch descriptions of several 
images are matched, either by the stereo or motion computation, to obtain a description of surface 
information at the zero-crossings. This representation is called the raw 2.^1) sketch. Finally, the raw 
27, l) sketch is interpolated to obtain complete surface descriptions, called the full 2^1) sketch |Marr, 




* 



CONSEQUENCE OF THE CORRESPONDENCE PROBLEM 4 

1978; Marr and Nishihara, 1978]. The first two stages have been considered elsewhere [Marr, 1976; 
Marr and Hildreth, 1980; Hildreth, 1980; Marr and Poggio, 1979; Grimson, 1980, 1981a, 1981b; 
Ullman, 1979a]. It is the final stage — the problem of surface interpolation — that is considered here, 
to obtain complete surface descriptions, called the full 2 jD Sketch [Marr, 1978; Marr and Nishihara, 
1978]. The first two stages have been considered elsewhere [Marr, 1976; Marr and Hildreth, 1980; 
Hildreth, 1980; Marr and Poggio, 1979; Grimson, 1980,1981a, 1981b; Ullman, 1979a]. It is the final 
stage - the problem of surface interpolation - that is considered here. 

The important point is that the form of the input and output representations can influence the 
design of the transformation. Here, we shall assume that die input representation consists Of explicit 
surface information, such as distance or reladve distance, along the zero-crossings of the convolved 
image (diese terms will be given technical definitions in Section 2). The output representation will be 
a complete specification of surface information, where by complete, we mean that an explicit distance 
value should be computed at every point on some grid representation of die scene. Our main concern 
in this paper is with the computational constraints needed to transform the input representation into 
the output representation. 

Although surface values at all points of the image are important, there is another aspect of 
surface information which should also be made explicit in the output representation. This is the set of 
discontinuities in surfaces; the occluding contours, both subjective and objective. Marr [1978] argues 
that the 2^-D sketch should be a viewer-centered representation which includes both explicit surface 
information, such as deptii and surface orientation, and explicit contours of surface discontinuities. In 
this paper, the concentration is on the problem of creating explicit surface information at all points 
of die surface. The question of surface discontinuities will be outlined, and possible algorithms 

o 

suggested, but an implementation of this stage has not been completed. 


2. Consequence of the Correspondence ProbfeTn 


We indicated above that we would concentrate on correspondence methods which could 
effect die three-dimensional surface reconstruction; stcrcopsis [Marr, 1980; Marr and Poggio, 1979; 
Grimson, 1980, 1981a] and structure from motion [Ullman, 1979a]. The three main steps of the 
correspondence problem arc: (1) identify a location in the physical scene in one image; (2) identify 
the corresponding location in a second image; either a second image taken from a different viewpoint 
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(stereo vision), or a second image taken at a later point in time (motion); and (3) compute a three- 
dimensional surface value, representing the distance of the point relative to some base point, based on 
die difference in the positions of the two corresponding points in the images. 

If one can identify a location beyond doubt in the two images, then the correspondence problem 
is trivial. It can be demonstrated, however, that both the stereo computation and the motion computa¬ 
tion can take place on very primitive descriptions of die images [Julesz, 1960; Ullinan, 1979a], As 
a consequence, the difficulty of the problem, for human vision, lies in the correspondence problem 
— which item in one image matches which item in the other. The reason for this is that for any primi¬ 
tive element from one description, there are liable to be many possible matching elements from the 
other description. This is especially true if image irradiance values are used as the basic descriptions. 
Consider an image of a matte-painted wall with uniform illumination. Given a small element of that 


wall from one image, it is virtually impossible to distinguish which small element from the other view 


matches it. On the other hand, if there is a scratch or texture marking on the wall, it is likely that such 
a location can be distinguished in the two views. This suggests that die representation upon which 
the coriespondence operation takes place should reflect diose positions in an image at which some 

physical property of the underlying surface is changing. This representation is called die primal sketch 
[Marr, 1976; Marr and Hildreth, 1980]. 


Mair and Hildieth [1980, also Hildreth, 1980] have refined the preceeding intuitive argument 

into more rigorous computational arguments, in conjunction with evidence from neurophysiology and 

psychophysics. They argue that the primal sketch representation is computed by determining those 

locations in an image at which the corresponding surface location undergoes a change in one of its 

physical properties, for example, reflectivity, surface orientation, texture or surface material. Such 

changes will correspond to a step change in image irradiance, at some scale. There are many ways of 

detecting the irradiance changes. Marr and Hildreth argue on various grounds for using the following 
scheme: 


(1) Convolve the image with a set of filters given by the Laplacian applied to a Gaussian, 


V 2 G(r. 0 ) 



where a is a constant determined fr 


om psychophysical data. 


) 
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(2) Locate all non-trivial zero-crossings in the convolved irradiances. A non-trivial zero-crossing is a 
point at which the convolved irradiance values change from positive to negative, or vice versa. 
These zero-crossings form the basic representation upon which the later visual processing takes place. 

Given this representation of the images, the correspondence problem can now be solved. Both 
Ullman’s [1979a] theory of motion perception and Marr and Poggio’s [1979] theory of stereo vision 
perform this computation on such primal sketch descriptions. As a consequence, explicit three- 
dimensional surface information (such as distance, or surface orientation) can be computed only at 
points corresponding to zero-crossings in the primal sketch. This would yield a sparse surface repre¬ 
sentation. Yet clearly, our perception is of complete surfaces (see for example Figure 1). In addition, 
a “nice” boundary is found for the central square. This implies that once the correspondence problem 
is solved, either by the stereo computation or by the motion computation, an interpolation must 
be performed between the surface values given at the zero-crossings, to obtain a complete surface 
description, and surface discontinuities should be explicitly demarked. 


3. The Surface Consistency Constraint 


We now turn to the heart of the matter, the computational constraints involved in the process 
of creating complete surface specifications, by interpolating between known points. We are given 
as basic input to the interpolation process, the zero-crossings of a convolved image, with depth infor¬ 
mation computed along these zero-crossing contours. Suppose one were to attempt to construct a 
complete surface description based only on the surface information known along the zero-ciossings. 
An infinite number of surfaces would consistently fit the boundary conditions provided by these sur¬ 


face values. Yet there must be some way of deciding which surface, or at least which small family 
of surfaces, could give rise to the zero-crossing descriptions. 'Phis means that there must be some 
additional information available from the visual process which, when taken into account, will identify 
a class of nearly indistinguishable surfaces that represent the visible surfaces of the scene. 

In order to determine what information is available from the visual process, one must first 
carefully consider the process by which the zero-crossing contours are generated. The Marr-Hildreth 
theory of edge detection [Man and Hildreth, 1980; Hildreth, 1980] relies on the fact that sudden 
changes in the reflectance of a surface, for example, caused by surface scratches or texture markings, 
will give rise to zero-crossings in the convolved image. Sudden or sharp chang.es in orientation or 
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shape of the surface will under most circumstances also give rise to zero-crossings. This fact can be 
used to constrain the possible shapes of surfaces which could give rise to particular surface' values 
along zero-crossing contours. 

We illustrate the basic argument with an example. Suppose one is given a closed zero-crossing 
contour, within which there are no other zero-crossings. An example would be a circular contour, 
along which tire depth is constant. There are many surfaces which could fit this set of boundary 
conditions (see Figure 3). One such surface is a fiat disk. However, one could also fit other smooth 
surfaces to this set of boundary conditions. For example, the highly convoluted surface formed 
by sin^-y/z 2 + y‘ 2 ) would be consistent with die known disparity values. Yet in principle, such a 
rapidly varying surface should give rise to other zero-crossings. This follows from the observation that 
if the surface orientation undergoes a periodic variation, then it is likely that the irradiance values 
will also undergo such a variation. Since the only zero-crossings are at the borders of the object, this 


implies that the surface sin 
conditions. 


y 2 ) is not a valid representative surface for this set of boundary 


Hence, the hypothesis is that the set of zero-crossing contours contains implicit information 
about the surface as well as explicit information. If one can determine a set of conditions on the 
surface shape diat cause inflections in the irradiance values, dien one may be able to determine a 
likely surface structure, given a set of boundary conditions along the zero-crOsSing contours. 

3.1 No News is Good News 

In general, any one of a multitude of widely varying surfaces could fit die boundary conditions 
imposed along the zero-crossings. To be completely consistent with the imaging pTOceSs, however, 
such surfaces must meet both explicit conditions and implicit conditions. The explicit conditions 
are given by the depth or surface orientation values along the zero-crossing contours. The implicit 
conditions are that the surface must not impose any zero-crossing contours other than those which 
appear in the convolved image. This implicit condition leads to the 'surface 'consistency constraint, 


namely: 


The absence of zero-crossings constrains the possible Surface shapes. 


Just as the presence of a zero-crossing tells us that some physical property is changing at a given 
location, the absence of a zero-crossing tells us the opposite, that no physical property is changing. 









Figure 3. Possible Surfaces Fitting Depth Values at Zero-Crossings. Given boundary conditions 
of a circular zero-crossing contour, along which the depth is constant, there are many possible 
surfaces which could fit the known depth points. Two examples ai e a flat disk, and the highly 

convoluted surface formed by sin (n/® 7 -f- y 2> j, shown here. 




and in particular that die surface topography is not changing in a radical manner. We informally refer 
to this constraint as no news is good ncws^ since it says diat the surface cannot change radically without 

informing us of this fact by means of zero-crossings. 

In order to make explicit any constraints on the shape of the surface for locations in the image 

not associated with a zero-crossing, one must carefully examine the image formation process. Many 
factors are involved in die formation of image irradiances. As a consequence, changes in any of those 
factors can cause a change in the image irradiances, and hence a zero-crossing in the convolved image. 
For example, a change in surface material can correspond to a change in albedo, and hence to a zero- 
crossing in the convolved image* a discontinuity in depth can coiicspond to a change in the illumina 

if a 

lion striking the surface, and hence to a zero-crossing,: and a discontinuity in surface orientation can 
correspond to a change in the amount of illumination reflected toward the viewer, and hence to a 


* 
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zero-crossing. We are interested in showing that the inverse is also true — in particular, that in regions 
in which the illumination and albedo are roughly constant, the absence of a zero-crossing implies that 
the surface shape cannot be changing in a radical manner. 

The basic result, corresponding to the intuitive argument of Figure 3, is that the probability of a 
zero-crossing occurring in regions where the illumination is roughly constant and the surface material 
does not change is a monotonic function of the variation in the orientation of the surface normal. 
(An analytic proof may be found in Grimson [1981c].) This provides a constraint on the possible 
surfaces that could be interpolated through a set of known points, and is referred, to as the surface 
consistency constraint. It means diat the probability of a zero-crossing increases as the variation in 
surface orientation increases. By inverting this argument, the best surface to fit the known data is that 
which minimizes the variation in surface orientation since this surface is most consistent with the zero- 
crossings in the convolved image. 


4. The Computational Problem 


We are now ready to consider die computational problem associated with the task of construct¬ 
ing complete surface specifications consistent with the information contained, in the zero-crossings. 
The modules of early visual processing, such as stereo or structure-from-motion, provide explicit 


information about die shapes of the surfaces at specific locations in die images, corresponding to die 
zero-crossings of the convolved images. The surface consistency constraint indicates implicit informa¬ 
tion about the shapes of the surfaces between the zero-crossings, stating that between known depth 
values, die surface cannot change in a radical manner, since such changes would usually give rise 
to additional zero-crossings. These two factors will now be combined, to obtain a complete surface 
specification. 


4.1 Using The Surface Consistency Constraint 


Suppose we are given a set of known depth points. We want a method for finding a surface 
to lit through these points that is “most consistent” with the surface consistency constraint. We will 
find the most consistent surface in two ways. In the surface interpolation problem we construct a 
surface th.it exactly fits the set of known points. The problem can be relaxed somewhat into a surface 






approximation problem, by only requiring that the surface approximately fit the known data and be 
smooth in some sense. 

Given the initial boundary conditions of the known depth values along the zero-crossing con¬ 
tours, there is an infinite set of possible surfaces that fit through those points. We need to be able to 
compare members of this set of all possible surfaces fitting through those points, to determine which 
surface is more consistent. If we can do this, then the “most consistent” surface can be found by com¬ 
paring all possible surfaces. A traditional method for comparing surfaces is to assign a real number 
to each surface. Then, in order to compare tire surfaces, one need only compare tire corresponding 
real numbers. The assignment of real numbers to possible surfaces is accomplished by defining a 
functional, mapping the space of possible surfaces into the real numbers, ©.AT i-+ This functional 
should be such that the more consistent the surface, the smaller the real number assigned to it. In 
order to satisfy the surface consistency constraint, the functional should measure variation in surface 
orientation. In this case, the most consistent surface will be the surface that is minimal under the 
functional. (For further details and background information about the use of functionals is, see, for 


example, Rudin [1973].) 

The key mathematical difficulty is to guarantee the existence and uniqueness of a solution. In 
other words, we need to guarantee that there is at least one surface which minimizes the surface 
consistency constraint, and to guarantee that any other surface passing through the known points, for 
which the functional measure of surface consistency has the same value, is indistinguishable from the 
first surface. This issue is not just a mathematical nicety, however, but is essential to the solution 
of many computational problems. Suppose we devise an iterative algorithm to solve some problem. 
What happens if we cannot guarantee the existence of a solution? The iterative process could be set 
off to solve an equation and never converge to an answer — clearly an undesirable state. Further, 
suppose a solution exists but is not guaranteed to be unique. Then an iterative process might converge 
to one solution when started from one initial point, and converge to another solution when started 
from a different initial point. Although small variations in the different solutions might be acceptable, 


the solutions should not differ in a manner inconsistent with our intuition about the problem. 1 bus, in 
the case of visual surface interpolation, the real trick is to find a functional which accurately measures 


the variation in surface orientation, as well as guarantees the existence of a unique best surface (or a 


family of indistinguishable surfaces). 
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How can we guarantee the existence and uniqueness of a solution? In our particular case of 
surface interpolation, we will be using the calculus of variations to determine a system of equations 
which the most consistent surface must satisfy, by applying the calculus to the situation of fitting a 
thin plate through a set of known points. While this system of equations characterizes the minimal 
surface, it does not guarantee uniqueness. The form of die boundary conditions (the set of known 
points) will determine the size of the family of minimal surfaces. Unfortunately, determining the 
types of input for which a unique solution exists is generally very hard. Instead, we will exploit a 
general case of the mathematical existence of a solution with tire weakest possible conditions on the 
functional. That is, we will determine a weak set of conditions on tire functional that are needed to 
ensure that a unique most consistent surface, or at least a unique family of surfaces that are most 
consistent, will exist. We will show that if the functional is an inner-product on a Hilbert space of 
possible surfaces, dien a unique most consistent surface will exist. (A Hilbert space is an extension of 
normal Euclidean space — basically a vector space in which a dot product operation exists, so that we 
can relate tire vectors to the real line, and in which functions are usually used in place of the normal 
notion of vector.) 

In general, it is extremely difficult to find a functional that measures surface consistency and 
satisfies the conditions of an inner-product. Hence, we will show that if the functional is a semi-inner 
product on a semi-Hilbert space of possible surfaces, then die most consistent surface is unique up to 
possibly an element of die null space of die functional. The null space is simply die set of surfaces 
diat cannot be disdnguished by the functional from die surface which is zero everywhere. In this 
way, the family of most consistent surfaces can be found. Based on the form of the null space, we 
can determine whether or not the differences in minimal surfaces arc intuitively indistinguishable, and 
what conditions on the known boundary values will guarantee a unique minimal surface, from diis 
family. 

Having derived conditions on die functional, we need to show that there is such a functional. 
The surface consistency constraint implies that the functional should measure variation in surface 
orientation over an area of the surface. Although the condition of a semi-inner product is a mathe¬ 
matical requirement needed to guarantee a solution, it does not restrict in an unreasonable way the 
kinds of surfaces we consider, and gives rise to at least two very natural functionals, both of which can 
be derived from die calculus of variations: one measures the integral of the square I aplacian applied 




to the surface and the other measures the quadratic variation of the local x and y components of the 
surface orientation. Besides the mathematical justifications, we will also show practical and intuitive 
reasons in support of such functionals. 

Given that there are at least two possible functionals, are there others? We will show that if we 
require a functional that is: 


1. a monotonic function of the variation in surface orientation, 

2. a semi-inner product, and 

3. rotationally symmetric, 

then there is a vector space of possible functionals, spanned by the square Laplacian and the quadratic 
variation. In other words, there is a family of possible functionals, given by all linear combinations of 


these two basic functionals. 

Given that there is more than one possible functional, how do they differ? Using the calculus 
of variations, and some results from mathematical physics, we will show' that the surfaces which mini¬ 
mize these functionals will be roughly identical in the interior of a region and will differ only along 
the boundaries of a region. As well, die null spaces of the functionals will differ, implying different 
families of most consistent surfaces corresponding to each functional. We know that the minimal 
surface is unique up to possibly an element of the null space. Since we require that the solution 
surface either be unique, or a member of an indistinguishable family of solutions, the size of the null 
space is important in judging the value of a functional. Based on this, we will argue that the quadratic 
variation is to be preferred over the square Laplacian. If we require that the surface pass through the 


known points, we can show that the form of tire stereo data will force a unique most consistent surface 
for the case of quadratic variation, while this is unlikely for functionals such as the square Laplacian. 
Thus, we assert on mathematical grounds that the best functional is one which measures quad¬ 


ratic variation in surface orientation, and the most consistent surface to fit to the stereo data is that 
which passes through the known points and minimizes the quadratic variation. In a later section, 
wc will see examples of lire types of minimal surfaces obtained under quadratic variation and the 
square Laplacian. It will be seen that the mathematical distinction in size of null space has a practical 
consequence, as the types of surfaces which minimize the square Laplacian will be seen to be inconsis¬ 
tent with our intuitive notion of the best surface to fit to the known points, while the surface which 
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minimize the quadratic variation will be seen to be much more consistent with our intuitive notion of 
the best surface. 

4.1.1 The Problem is Well-Defined 

If surfaces are to be compared, by using a functional from the space of surfaces to the real 
numbers, with the purpose of finding the surface that best satisfies the surface consistency constraint, 
it is necessary to ensure that such a goal is attainable. What conditions on the form of the functional, 
or on the structure of the space of functions, arc needed to guarantee the existence of such a “best” 
surface? One key constraint on the functional is given by the following theorem. The main point of 
the theorem is that in order to ensure that die problem is well-defined the functional should have the 
characteristics of a semi-norm. 

Theorem 1: Suppose there exists a complete semi-norm & on a space of functions H, and that © 
satisfies the parallelogram law (for definition, see proof of theorem). Then, every nonempty closed convex 
set E C H contains a unique element v of minimal norm, up to an element of the null space. Thus, the 
family of minimal functions is 

{v + S I s e 5} 

where 

S — {t> — w | w G e} n if 

and JT is the null space of the functional 


Jf = {u | 0(u) = 0}. 

Proof: [See for example Rudin, 1973], Any space with a semi-norm defined on it can be 
associated with an equivalent normed space. Let W be a subspace of a vector space H. For every 
v £ H, let k{v) be die coset of W that contains v, 


ir ( v ) ~ {v -f- u: u £ W). 


These coscts are elements of a vector space H/W called die quotient space of H modulo W. In 
space, addition is defined by 

7r(i>) j- 7r(w>) = : - Tt(v -f- w) 


and scalar multiplication is defined by 


Q7r(t;) = 7 t(qv). 

The origin of the space H/W is 7r(0) = W. Thus, 7r is a linear map of H onto H/W with W as its 
null space. 

Now consider the semi-norm © on die vector space H. Let 

df = {i>: 0(u) — 0}. 


This can easily be shown to be a subspace of H. Let 7 r be the quotient map from H onto ///df, and 
define a mapping ©OZ/df t-f SR, 

©'(^(u)) = 0(u). 

If 7r(i>) = 7r(ui) then &(v — w) = 0. Since |©(v) — 0(u;)| < &(v — w), then ©'(fl/t;)) = 0'(7r(u>)) 
and ©' is well defined on H /JT. It is straightforward to show that ©' is a norm on iZ/df. 

Now we can prove the statement of the theorem. The set E, a subset of H, can be transformed 
into a set E' in the quotient space df/df while preserving the convexity and closure properties. 

The parallelogram law states 

[©'(« + u>)f + [©'(v - w)f = 2[©'(u)] 2 + 2[© , (u;)] 2 . 


Let 


d = inf {©'(w):w ££'}. 


Choose a sequence v n G E' such that &{v n ) t-+ d. By the convexity of E', we know that \{v n -|~ 
v m ) G FJ and so [@'{v n u m )J > id 2 . If v and w are replaced in the definition of the paral- 
lclogram law by v n and v, n , then tlic right hand side tends to id 2 . But [@'(i>„ + v„,)] > id 2 , so one 
must have [©'(t^ — i/ m )] *-+ 0 to preserve the equality. Thus, {t^} is a Cauchy sequence in H/X. 
Since the norm is complete, the sequence must converge to some v G E f , with Q'(v) — d . 

To prove the uniqueness, if v,w G E ! and ©'(v) = d } @'(w) — d tlien die sequence 
{v } w, v } 10 ,...} must converge, as we just saw. Thus v iv and die element is unique. 

We have proven that under the norm 0' on die quotient space U/> NT, the set EJ has a unique 
minimal element. Hence, the structure of the quotient space implies that under the semi-norm © on 
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the space H , the set E has a unique minimal element s, up to possibly an clement of the null space .V. 
In other words, tire family of minimal elements is 

{n + s | sGS} 

where 

S — [v — w | w £ e} n . 


This theorem specifies one set of mathematical criteria needed to ensure that thefe l^ists a 
unique minimal element. Thus, if the surface consistency constraint could be specified by a functional 
that satisfied the conditions of a complete semi-norm, obeying the parallelogram law, it might be 
possible to show that there is a unique coset of “most consistent” surfaces. We would really prefer to 

be guaranteed a unique surface, rather than some set of surfaces. One way to tighten the result of the 
theorem is to require that the functional is a norm. 


Corollary 1.1: If © is a complete norm on a space offunctions H, which satisfies the parallelogram 
law, then every nonempty closed convex set E C H contains a unique element v of minimal norm. 

Proof: If the functional is a norm, tire null space is tire trivial null space, and the result holds 


uniquely, g 


The theorem can be rephrased in tenns of the surface interpolation problem as follows. 
Corollary 1.2: Let the set of known points be given by 



{(xuVi) \ i = l,...,N} 

where the associated depth value is'Fi. Let °J be a vector space of “possible’'functions on & and let 


u = {/£<? | f(x t , Ui ) = Fi i = \,.. V) N} 

so that U is the set of functions that interpolate the known data {/’}. Let © be a semi- norm, which 
measures the “consistency’' of a function f £ X, that is, nv shall say that f is “bet ter" than g if 9(f) < 
9(g)- Ij 0 is a complete semi-norm and satisfies the parallelogram law, then there exists a unique (to 

within a function of the null space of 9) function S £ V that is least inconsistent and interpolates flic 

‘ * 

Jai'L Hi h< <’ the mterpohmufi problem is well-defined. 


TZ9* 









Proof: Clearly U is a convex set since for any /, g £ U, 


[X/ + (1 - \)g](x if Vi ) = (X + 1 - X)Fj = Fi 

for any data point ( Xi, yi). Furthermore, U is closed, since if/ n £ U and/„ i-» /, then f{xi, yi ) = F,- 
and f E U. Then the previous corollary states that U has a unique (to within an element of the null 
space) element of minimal norm, which is exactly the desired “most consistent” surface. | 

This corollary is a translation of Theorem 1 into the problem of interest to us, finding the surface 
most consistent with the known data from the stereo algorithm. It specifies a set of conditions under 
which the interpolation problem is well-defined. Here, the notion of well-defined refers to finding 
a solution to the interpolation problem that is unique, and by unique we mean up to possibly an 
element of the null space of the semi-norm. As a consequence, the extent and structure of the null 
space of any semi-norm chosen to incoiporate the surface consistency constraint will be important 
in determining the utility of that semi-norm. In general, die smaller the null space, the tighter the 
constraint on die family of possible surfaces which can interpolate the known data. For example, if 
the possible variations in the least inconsistent surface were very large (due to the semi-norm having 
a large null space), then the utility of this semi-norm would have to be questioned. On the other 
hand, if the null space is small, and the resulting variations in possible least inconsistent surfaces were 
small, the semi-norm would have to be given serious consideration. We will see examples of possible 
functionals and their null spaces in Section 4.1.4. 

Thus, Theorem 1 and Corollary 1.1 specify two different sets of sufficient, but not necessary, 
criteria for ensuring differing types of uniqueness. In both cases, die criteria applied directly to the 
structure of the functional. Of course, the real trick is to find a functional © which captures our 
intuition of variation in surface orientation and meets the requirements needed to guarantee a unique 
solution. 


4.1.2 The Space of Functions 


Theorem 1 describes a set of sufficient conditions for obtaining a unique family of minimal 
surfaces. The fundamental point is that we require a complete parallelogram serni-norm to ensure 
a unique solution. These conditions precisely define a semi-inner product, and hence the space of 
functions over w Inch we seek a minimum must he a semi-1 lilbcrt space. 
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Corollary 1.3: If C J is a semi-Hilbert space of possible surfaces, and G(v) — p(v, v)i is an inner 
product semi-norm, where p(v, is the semi-inner product on the space If, then there exists a unique 
surface in T (possibly to within an element of the null space of the semi-norm) that minimizes the semi- 
norm © over all surfaces. 

Proof: By the definition of Hilbert space, the semi-norm is complete. It is easy to show that it 
satisfies the parallelogram law from the definition of Q(v) = p(v, w)i Thus, if the space of functions 
is a semi-Hilbert space, then, by Theorem 1, die interpolation problem is guaranteed to have a unique 
minimal solution, possibly to within an element of the null space. | 

Corollary 1.4: If T is a Hilbert space of possible surfaces, and &(v) = p(v, u)5 is an inner 
product norm, where p{v, u)? is the inner product on the space < f, then there exists a unique surface in T 
that minimizes the norm © over all surfaces, g 

4.1.3 The Form of the Functional 


I he major problem is to determine die functional ©. The surface consistency constraint related 
die consistency of a surface to the variation in surface orientation. This forms die first constraint on 
the functional. Theorem 1 states that if die functional is a complete, parallelogram, semi-norm, dlcn 
the problem has a well-defined solution. This forms the second constraint on die functional. Thus, 
if a functional can be found that is a complete, parallelogram semi-norm, and that is a monotonic 
function of die variation in surface orientation, then diis functional will constitute an acceptable 
measure of surface inconsistency. Any surface that is minimal under such a functional would then be 
an acceptable reconstruction of the original surface in space. 

The problem may be considered in die following manner. With every point on the surface 
/, associate a pair of partial derivatives, f x — p,f y = q, and hence, an orientation. Each point 
on the surface may be mapped to a point in a space spanned by p and q axes, the gradient space 
[Huffman, 1971; Mackworth, 1973; Horn, 1977], With each surface patch, one may then associate a 
neighborhood of p-q space by mapping die p and q values associated with each point on the surface 
into gradient space. This neighborhood will be referred to as the p-q neighborhood spanned by die 
surface patch. 

The surface consistency constraint implies that the probability of a zero-crossing occuriite within 


a patch of the surface is monotonically related to the size of the p-q neighborhood spanned' by the 
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surface patch. The surface consistency theorem [Grimson, 1981c] embodies a specific method for 
measuring this probability. Any functional that is monotonically related to the size of the p-q neigh¬ 
borhood will suffice, however. This is important, since it is also necessary that the functional be a 
complete, parallelogram semi-norm. Thus, the problem reduces to specifiying such a complete, paral¬ 
lelogram semi-norm, which monotonically measures the surface consistency constraint by measuring 
a monotonic function of the size of the p-q neighborhood spanned by each surface patch. To find 
the most consistent surface, one need only find the surface that minimizes this functional over each 
patch. Note that the surface will be “most consistent’’ only relative to the functional. There may be 
several functionals that are complete, parallelogram, semi-norms and that are monotonic functions of 
the variation in surface orientation. Each will give rise to slightly different minimal surfaces. 

Of course, there are some constraints on the minimization of the measure of the p-q neighbor¬ 
hood. For example, each surface patch cannot be minimized in isolation. To see this, consider a 
cylindrical (or one-dimensional) surface. Between any two adjacent zero-crossing points, the mini¬ 
mization of the variation in surface orientation (related to the size of the neighborhood in p-q space) 
would result in a single point in gradient space, corresponding to a planar surface between the known 
depth values. The problem with this simple method of reducing surface inconsistency is that it does 
not account for the interaction of surface patches. In particular, such a method would result in 
a piecewise planar surface approximation. For any three consecutive zero-crossing points, such a 
method would introduce a discontinuity in surface orientation at the middle zero-crossing. This is 
unacceptable since the surface is required to be twice differentiable. Thus, there are some constraints 
on the manner in which the neighborhoods in p-q space arc minimized. 

We are thus faced with the following problem. We know from the boundary conditions 
provided by the stereo algorithm that the surface must pass through a set of known depth points 
located at the zero-crossings of the convolved images. We further know that at all other points in 
the image, the surface cannot change in such a manner as to cause an additional zero-crossing. With 
each surface portion, we can associate a measure of the probability of that surface implying a zero¬ 
crossing in the convolved image intensities. Since between zero-crossings, there are no other zero- 
crossings, this gives a measure of the inconsistency of this particular surface portion. To choose 
the least inconsistent surface, we must reduce this probability, as measured over all portions of the 
surface. 



USING THE SURFACE CONSISTENCY CONSTRAINT 


4.1.4 Possible Functionals 


In this section, possible functionals © are considered, seeking complete, parallelogram semi- 
inner products where possible, since this will guarantee that the solution is unique to within the null 
space. However, it is important to stress that there may be several viable alternatives. The computa¬ 
tional theory argued that the functional should measure the “consistency” of the surface. The attempt 
here is to define a measure based on this. The measure should be in a form that allows the constraints 
on the problem to be easily expressed. Also, the measure should be a semi-inner product on a semi- 
Hilbert space, in order to ensure a unique solution. We begin by considering several candidates in 
light of these requirements. 

Case 1: One Dimension 


For ease of discussion, the case of a cylindrical or one-dimensional surface will be considered 
first. By a cylindrical (or developable) surface we mean a second differentiable surface oriented along 
the y axis such that<9//c9y = q = c, for some constant c. 

Example 1.1: The variation in the normal to the curve is clearly related to its curvature. One 
could thus consider using a functional that directly measures curvature and integrates this measure 
over an area of the surface. To ensure that the functional is positive, this suggests using a functional of 


the form: 


©i(/) 



k 2 ds > = 



/ 


xx 


(i +m 


c lx 


where fc is the curvature of the curve at a point. (Recall that the subscript here implies a partial 
derivative, so that f 2 xx — (d~f /dx 2 ) 2 .) Although this is perhaps the most “natural” definition of a 
functional, it is not a semi-norm, and hence it is considered to be unacceptable. To see why it is not a 
semi-norm, consider the following. If / is in the space of surfaces, then 


©i («/) 



«©i(/). 




This condition will be true only if f x == 0. While this is certainly far too restrictive a condition to 
place on the possible surfaces, it does suggest a possible alternative. 

Example 1.2: A second choice is the quadratic variation of the gradient, which may be measured 

by: 

©2 (/)= 

Note that it is a close approximation to the curvature of the curve, provided that the gradient f x is 
small. @2 is a semi-norm, so the surface that minimizes this norm will be unique to within an element 
of the null space of the semi-norm. The null space of @2 is the set of all linear functions: 

Jf = span{l, x} 


where 

span{t>i,. •., {gjDj -I - • • • ~f- ct m v m | Q[,..., dm £ 

Not only does this form of the functional satisfy the mathematical criteria of a complete, paral¬ 
lelogram semi-norm, it has a strong relationship to the “natural form ©i(/), since the restiiction of 
small f x is acceptable. Those cases in which f x is not negligible correspond to situations in which the 
surface is rapidly curving away from the viewer. These situations are such that the curvature of the 
surface will cause it to be invisible in one eye — giving rise to occluding boundaries. In general, we 
can assume diat the image docs not consist solely of occluding boundaries, so that such occurrences 
will be rare in an image. Moreover, between such points, the surface will satisfy the restriction and the 
above semi-norm is well-suited to the interpolation problem. 

Case 2: Two Dimensions 

To each of the examples of the one-dimensional case, there is an analogous two-dimensional 


case. 


Example 2.1: As in the one-dimensional case, one possibility is to measure the curvature of the 
surface. The curvature of a surface is usually measured in one of two ways. 

For any point on the surface, consider the intersection of the surface with a plane containing 
the normal to the surface at that point. This intersection defines a curve, and the curvature of that 
curve can be measured as the arc-rate of rotation of its tangent. For any point, there arc infinitely 
many normal sections, each defining a curve. As the normal section is rotated through 2n radians. 
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all possible normal sections will be observed. There are two sections of particular interest, that which 
has the maximum curvature and that which has the minimum. It can be shown that the directions of 
the normal sections corresponding to these sections are orthogonal. These directions are the principal 
directions and the curvatures of the normal sections in these directions are the principal cutyalures, 
denoted K a and kv It can be shown that the curvature of any other normal section is defined by the 
principal curvatures. 

There are two standard methods for describing the curvature of the surface, in terms of the 
principal curvatures. One is the first (or mean) curvature of the surface 

/ = «a + 

The other is the second or Gaussian curvature of the surface 

K — K a • Kb. 


For a surface defined by the vector {x, y, f(x, y)}, these curvatures are given by 


J 


d_ 

dx 


u 


■f 


k 


fl + V y^~+ fl + /y / 


and 


K 


fxxfyy f: 


y 


v 


(i+ a+fi) 

Thus, there are two possibilities for the functional. One is to measure the first (or mean) 
curvature of die surface, 


e 3 (/) 


// 


J 2 dxdy 



As in the one-dimensional case, this is not a semi-norm, since 

( 


0;r(a/) 


a 


( 


// 


[L A l + a 2 /*) + f !W ( l -f a 2 fl) — 2af,afJ xv ^ 
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(I 4- oV; f o 2 /*) 
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dxdy 





However, if f x and f y are assumed to be small, then it is closely approximated by a semi-norm. In this 
case, consider 

© 4 (/ ) = {/ / (vrfdxdy 

This is a semi-norm, with null space consisting of all harmonic functions. 

A second possibility for reducing curvature is to reduce the second or Gaussian curvature, 

05 (/) = 

By an argument similar to the above, it can be shown that this is not a semi-norm. Note that by using 
the above approximation of small f x and f y , we get the functional 



We will return to this form later. 

Example 2.2: As in the one-dimensional case, one can also consider the quadratic variation. The 
quadratic variation in p = f t is given by 

/ / { p z+ p i) dxdy 

and the quadratic variation in q = f XJ is given by 

/ / fal + dxdy ' 

If the surface is twice continuously differentiable, then p y — q x , and by combining these two varia¬ 
tions, one obtains the quadratic variation: 

0 ? (/) = if / (fix + 2 fly + f 2 yy ) dxdy^ . 

Again, as in the one-dimensional case, this is a complete semi-norm that satisfies the parallelogram 
law. Hence, the space of interpolation functions has an element of minimal norm, which is unique up 
to an element of the null space, where the null space is the set of all linear functions: 

Jf — span{i, x, y}. 




Duehon (IT/5, 1976) refers to the surfaces that minimize this expression as thin plate splines 
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4.2 Where Do We Stand? 



Wc have seen that for the general surface interpolation problem, there are two constraints on the 
possible functionals. One is that the functional must measure a monotonic function of the variation in 
surface orientation. The other is that the functional should satisfy the conditions of a complete paral¬ 
lelogram semi-norm, or equivalently, a semi-inner product. If the functional satisfies these conditions, 
then we know that there will be a unique family of surfaces that minimize this functional and hence 
form a family of best possible surfaces to fit through the known information. In the examples sketched 
above we saw that there are at least two possible candidates for this functional, namely the square 
Laplacian, 

0 4 (/) = [j J (V 2 /) dxdy 

and die quadratic variation, 

0 7(/) = {/ / {fix + Vly + fly) dxdy 




'I'llere are several points still to consider. Are there other possible functionals? How do the mini¬ 
mal solutions to these functionals differ? What criteria can be applied to determine which functional is 


best suited to our surface interpolation problem? What is the best functional under those criteria? In 
die remainder of this chapter, we shall consider these questions in detail. The point we shall develop is 


diat the appropriate functional to apply is die quadratic variation, and thus die surface that minimizes 
diis functional is most consistent with the imaging information. 


4.3 Are There Other Functionals? 


We have determined at least two functionals diat meet our conditions. Are there other possible 


functionals, and if so, how do their minimal solutions differ from those of the 

* 

the quadratic variation? 


po< 

are Laplacian arid 


To answer this question, we rely on a result of Brady and Horn [1981], which we sketch below. 
Recall that the basic conditions on the functional were that it measure''# monotonic function of the 
variation in surface orientation, and that it be a semi-inner product. The first requirement suggests 
that tlo lunctional must involve terms that arc functions of the second order partial derivations of the 






surface, since such terms will be related to the variation in surface orientadon. The second require¬ 
ment is needed to ensure the uniqueness of the solution. Recall that (x(f, g) is a semi-inner product 
if 

1- M(/> 9 ) = »{g> f), 

2- M/ + 9, h) = n{f, h) -f n{g, h), 

3. n{af, g) = Q/i(/, g), 

4. Kf, f ) > o, 

and that given a semi-inner product n{f, g), one can define the desired functional by 0(/) = 

The difficult condition to satisfy is (3), which implies that the semi-inner product should not 
contain any constant terms. The conditions taken together imply that we should consider any quad¬ 
ratic form as a possible semi-inner product: 

M(/) 9) — J J&fxx9xx "I - Pfxy9xy 7 fyy9yy ~t” 

~\~fi(fxx9xy ~f~ fxy9xx) “H e ( fxx9yy “f" fyy9xx) "f" $(fxy9yy "f” fyy9xy)- 

Thus, the corresponding functional will have the quadratic form: 

®(/) — J J aflx + Pfly + 7 fly + 2 Sf xx f x y -f- 2 efxxfyy + 2 ifxyfyy 


The final condition we apply to die functional is that it be rotadonally symmetric. This follows 
from the observation drat if the input is rotated, the surface that fits die known data should not change 
in form, other than also being rotated. 

Minimizing the quadratic form of die functional &(/) can be considered as finding die mini¬ 
mum over the integral of the function (A f) r MAf where A/is die vector: 


and M is the symmetric matrix 




a 6 e 

6 (3 < 

e f 7 
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IfR is a rotation matrix, then the condition of rotational symmetry is given by 

(RAf) r M(RAf) = Af r MAf. 

Vector algebra implies that we must have 

R T MR = M or R r M — MR~ l . 

Equating elements shows that the matrix M must have the form 

f+e 0 

0 0 
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variation since: 
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There are two important consequences of this fact. The first is that the set of all possible func¬ 
tionals forms a vector space, since if M, and M 2 satisify the conditions, then so does oM x + vM*. 
1 he second is that this vector space of operators is spanned by the square Laplacian and the quadratic 

^ ~f~ 6 0 6 

0/3 0 

6 0 § e 

The first term of the sum corresponds to the square Laplacian while the second corresponds to the 

quadratic variation. Thus, for e = 1 and 0 = 0, the functional reduces to square Laplacian. For 

e •— 0 and 0 = 2, the functional reduces to quadratic variation. Finally, if e == 1 an d 0 = —1 

we obtain a functional which corresponds to the approximation to the integral of square Gaussian 
curvature derived in Section 4.1.4, 

Thus, we have answered our second question. There are other possible functionals, but they are 
all linear combinations of the two basic functionals, the square Laplacian and the quadratic variation. 

4.4 How Do the Functionals Differ? 

Given that there are many possible functionals, all linear combinations of the square Laplacian, 

0 i, and the quadratic variation, 0 7 , we must consider how the solutions to the square Laplacian and 
the quadratic variation differ. In other words, is there any noticeable difference in the surfaces that 
minimizes these two functionals, subject to fitting through the stereo data? To answer this question, 
we shall rely on the Calculus of Variations, (sec. for example, Courant and 1 filbert [195.1] and Forsyth 

I- i be salient points arc outlined below. 







4.4,1 Calculus of Variations 


The calculus of variations is frequently used to solve problems of mathematical physics, and is 
applicable to our surface interpolation problem. In particular, we can use the calculus of variations 
to formulate differential equations associated with problems of minimum energy. Suppose we are 
given a thin elastic plate, whose equilibrium position is a plane, and whose potential energy under 
deformation is given by an integral of die quadratic form in the principal curvatures of the plate. 
We can consider the interpolation problem as one of determining the surface formed by fitting a 
thin elastic plate over a region ®R»- (with boundary T) and through the known points. Using a small 
deflection approximation, the potential energy is given by 



The solution to the interpolation problem is then die surface which has the minimum potential 


energy. 


The calculus of variations can be used to characterize this problem by providing a set of 
differential equations (called the Euler equations) which the solution surface must satisfy. It can be 
shown (see Courant and Hilbert, [1953, p.251]) that the Euler equations for the interior of any region 
91) are given by 


- fxXXX ~1~ 2/j-xyy -)- 


’yyyy 


0 , 


except at the known points. Along die boundary contour T of the region, the solution surface must 
satisfy the equations (called the natural boundary conditions)'. 

M[f) — V 2 / + (1 — lAifcxx] + 2 fj-y'X^Vs ~f fyyVa) ~ 0 

f A*) Q~^fxx x n%s 4“ fxyi^nVs 4” sVn ) 4" jyyl/nUa ^ ~ 0, 

where d/dn is a derivative normal to the boundary contour, d/ds is a derivative with respect to 


arcicngth along die boundary contour and x s , y H and x n , y n are the direction cosines of the tangent 
vector and the outward normal respectively. The constant fj, denotes die tension factor associated with 


the elastic material of the plate. 

T here are two subcases of particular interest. In the first case, suppose that the tension factor is 


given by jt — 1. T he energy equation reduces to 
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which is simply the square Laplacian condition derived previously. The Euler equation is the bihar¬ 
monic equation V 4 / = 0 while the natural boundary conditions are 

V 2 / = 0 

£ v2 /=°' 

along die boundary contour T. In the second case, suppose that die tension factor is given by (i — Q. 
The energy equation reduces to 

J jjAx + 2 fl y + fl v )dxdy 


which is simply the quadratic variation condition, also derived previously. The Euler equation is 
identical to that of the square Laplacian, namely the biharmonic equation V 4 / == 0. The natural 
boundary conditions are different, however. They are given by 



In the simple case of a square boundary, oriented with respect to the coordinate axes, the boundary 
conditions reduce to: 



2 fxiy fyyy 

along the boundary segments parallel to the x axis, and 
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along the boundary segments parallel to die y axis. 

These boundary conditions can be straightforwardly simplified to: 
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fxxy 
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along the boundary segments parallel to the x axis, and 
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along (he boundary segments parallel to the y axis. 
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We have thus answered our question. For both the square Laplacian and the quadratic 
variation, the Euler equations are identical in the interior. The only difference to be noted in the 
extremal solutions to the two functionals will be observed at the boundaries of the surfaces. When we 
look at examples of solving these equations, this difference will become important. 

There is a second manner in which the minimal solutions to the functionals will differ, in part 
related to the difference in boundary conditions of the two solutions. While the form of the minimal 
surface under either functional is roughly the same, except at the boundaries, this minimal surface will 
be uniquely determined only to within an element of tire null space of the functional. This will be an 
important factor in determining which functional is best suited to our problem, since we would like 
the boundary conditions provided by the stereo data to completely determine a unique solution. The 
null spaces of tire two functionals differ greatly, since the null space of the quadratic variation is the 
space of all linear functions, while the null space of the square Laplacian is the much larger space of 
all harmonic functions. We shall consider the effect of this difference later. 

4.5 The Best Functional 

Given that the set of possible functionals forms a vector space spanned by the square Laplacian 
and the quadratic variation, what criteria can be applied to determine the best functional? Since the 
Euler equation for both of these basis operators is the biharmonic equation V 4 / = 0, the same will 
be taie of any other operator in the vector space. This implies that aside from boundary conditions at 
the edge of tire region being interpolated, the surfaces provided by any operator in this space will be 
basically tire same. 

This being the case, the only other characteristic that can distinguish between the possible func¬ 
tionals is the size of their respective null spaces. Let us denote the null space of the square Laplacian 

* 

by jV, (the space of all harmonic functions) and the null space of the quadratic variation by Jf '2 (the 
space of all linear functions). Note that Jf 2 is a subspacc of Jfj. Now the null space for any linear 
combination of these two operators must contain at least the space spanned by the intersection of the 
two null spaces Jf| and Jf 2 . Ilcnce, the null space of any other operator must consist at least of the 
linear functions. Thus, no possible operator can have a null space smaller than that corresponding to 
quadratic variation. 
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The importance of the null space is that it helps determine the family of surfaces that are mini¬ 
ma 1 under the functional. The requirement we impose on the best functional is that the member of 
this family corresponding to the minimal surface be uniquely determined, when combined with the 
requirement that the surface must pass through die known points provided by the stereo algorithm. 
Clearly the smaller the size of the null space, die fewer the requirements we must impose on the 
output of die stereo algorithm in order to insure a unique solution. 


We may view this criterion in the following manner. Initially, we start with the space of all pos¬ 
sible functions, namely, die space of all second differentiable functions of two real variables, denoted 
C' 2 (3? 2 ). If we restrict our attention to those surfaces diat pass through the boundary conditions 


imposed by the stereo or structure-from-motion data, we define a convex subset U of this space. If we 
define a functional on this space, the set of surfaces that are minimal under the functional ate given by 


where 


[y -j- w | toG W\ 


W = {v — u j u £ U) D Jf, 


for some minimal surface v £ U . We are guaranteed a unique solution to the interpolation problem 
if W is empty (or equivalently, consists only of the null surface, defined to be zero everywhere). 
The key question becomes: can we have two surfaces that fit through the known points, havedhe 
same measure of surface consistency (the same value as measured by the functional) and differ by an 
element of the null space? If not, we arc done, as the minimal surface is then guaranteed to be unique, 
fhus, the structure of die boundary conditions provided by the stereo algorithm (or the structure- 


from-motion algorithm) may be important in deciding which functional is more suitable. Clearly, the 
smaller die subspace of minimal surfaces, the more likely we are to have a unique minimal surface 


fitting the known data, as the set W is more likely to be empty. 


Recall diat die null space of die square Laplacian 



is the set of all haimonic functions. We wish to know what form of the boundary conditions Will 
uniquel) determine the harmonic function. 'This problem is known as die Dirichlet problem hi 




classical analysis, and it has long been known that if the boundary conditions consist of a series of 
closed, bounded Jordan curves, then the harmonic function is uniquely determined. These are, of 
course, sufficient, but not necessary conditions. It would appear, however, from these conditions 
that it is unlikely that the boundary conditions provided by the stereo algorithm will be sufficient to 
uniquely determine the component of the null space. This follows from the observation that the stereo 
algorithm is capable of providing boundary values at scattered points in the image, corresponding to 
the zero-crossings of the convolved image, while the Dirichlet problem is uniquely determined if the 
boundary values form a closed, bounded Jordan curve. Thus, in the case of the square Laplacian, the 
best we can do is determine a family of most consistent surfaces, which differ by harmonic functions. 
Referring back to our earlier question, we see that in this case, we could have two (or more) surfaces 
which fit through the known points, have the same measure of surface consistency, and differ by an 
element of the null space. The variation in such a family of surfaces is not consistent with our intuitive 
notion of indistinguishable surfaces, that is, the difference in tire shape of two surfaces which have 
identical minimal values for the square Laplacian measured over the surface can be noticably large. 
As a consequence, we consider the square Laplacian to be a poor choice for the functional. 

On the other hand, the null space of the quadratic variation 

0 (/) = if / {fix + 2 fly + 4y) dxdy^j 


is the set of all linear functions. The boundary conditions required in this case to uniquely determine 
the component of the null space are much simpler. In particular, if the stereo algorithm provides at 
least three non-colincar points, the element of the null space is uniquely determined to be the null 
surface (the surface which is zero everywhere). It is clear that in almost all imaging situations, the 
stereo algorithm will be capable of providing the necessary boundary conditions, and thus the most 
consistent surface is uniquely determined. 

Thus, we have seen that the only possible functionals that can be used to measure the surface 


consistency constraint form a vector space spanned by the square Laplacian operator and the quad¬ 
ratic variation operator. The minimal surface for any such operator satisfies the biharmonic equation 
in the interior of the region being interpolated, but along the boundaries of the region it may satisfy 
different differential equations than the minimal solution of any other operator. In general, this 
implies that the solution surfaces cot responding to different operators will generally differ in shape 
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only near the boundaries. To distinguish between possible operators, we examined the ffinn of their 
null spaces. We showed that the operator with the smallest null space was the quadratic variation. 
Further, the stereo data is in general sufficient to uniquely determine the component of the null space 
corresponding to the minimal surface. That is, the surface that minimizes the quadratic variation, sub¬ 
ject to passing through the known points provided by the stereo or structure-from-motion algorithms, 
is uniquely determined. 

4.6 The Computationai PrQljl^rrt 


By combining the results of the last two chapters, it is now possible to state the computational 
theory of the problem of interpolating visual surface information. 

The Interpolation of Visual Information: Suppose one is given a representation consisting of 
surface information at the zero-crossings of a Primal Sketch description of a scene (this could be either 
from the Marr-Poggio stereo algorithm , or from the U liman structure-from-motion algorithm, or both). 
Within the context of the visual information available . the best approximation to the original surface in 
the scene is given by the minimal solution to the quadratic variation in gradient (or surface orientation) 



Such approximations arc guaranteed to be uniquely “best" to within an element of the null space of the 
functional 0 . In the case of quadratic variation , the null space is the set of all linear functions. Provided 
the set of known points supplied by the stereo algorithm or by the stiveture from motion algorithm 
includes at least three non-colinear points ., the component of the surface due to the null space is uniquely 
determined to be the null surface. Hence , the surface most consistent with the visual information is 


uniquely determined. 

It is worth noting that although the above statement is phrased in terms of zero-crossings ob¬ 
tained from images convolved with V 2 G filters, the heart of the statement is much broader in scope. 
The key point is that to interpolate any surface representation which contains explicit information 
only at sparse points in the representation, we need to find the “most conserva)|iv^ ,r surface consistent 
with the input information. This implies that between the known surface points, the surface should 
vary a > hiilc as possible. Thus, whether those known points correspond to zero-crossings, edges, or 
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some other basic descriptor of image changes, the surface interpolation algorithm should construct the 
surface which minimizes variation in the surface between the known points. 

It is interesting to compare the criteria for surface interpolation developed here, as well as the 
specific theory of surface interpolation stated above with the work of Barrow and Tenenbaum, [1981]. 


5. Constrained Optimization 


Our goal throughout this paper has been to find the surface that best fits the known data 
provided by tire stereo algorithm or tire structure-from-motion algorithm. In the preceeding sections, 
we saw that such a “best” surface exists and is characterized as tire surface that minimizes the func¬ 
tional of quadratic variation, measured over the surface. The problem we address now is how to find 
that minimal surface. What is meant by “finding the minimal surface”? Our goal is to derive an 
algorithm that computes explicit surface values (such as depth, or relative depth) at all points on a 
discrete grid, m points on a side. (That is, the scene will be partitioned into an m X m grid, and to 
each grid point, we want to associated a surface value.) 

In general terms, we arc seeking an algorithm to solve an optimization problem — we want to 
compute the values of a set of parameters that optimize some function. In our case, the parameters 


correspond to tire surface values at the grid points, and tire function to be optimized is tire measure of 
a discrete correlate to quadratic variation over the surface. We will restrict our attention to tire class of 
optimization algorithms that satisfy three simple criteria of biological feasibility — parallelism, local- 
support, and uniformity. These three criteria, together with the form of the input data — scattered 
contours of known points — preclude many of the possible techniques for solving an optimization 
problem, but also suggest the use of techniques such as those of mathematical programming. 


5.1 The Role of Algorithmic Criteria 

An essential problem for any computational theory about early visual processing is to determine 
the implicit assumptions used by the visual systenr to perform tire computation. These are valid 
assumptions about die environment that arc explicitly incorporated into the computation. UUman’s 

rigidity assumption in visual motion perception [Utlman, 1979a], Marr and Hildreth s condition of 

# 

linear \ariation and spatial coincidence assumption (Marr and I lildreth, 1980], and Marr and Poggio s 


* 
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assumptions of uniqueness and continuity [Marr and Poggio, 1979] arc three examples. Such assump¬ 
tions may be considered as computational constraints on die problem. 

There is a second set of criteria diat may be applied to any theory and, more importantly, to any 
algorithm for a theory. They deal with the requirement of biological feasibility, and are important if 
one is to describe a model of the human system. They will be termed algorithmic criteria. Ullman 
[1979b] has listed a number of such criteria that should apply to any biologically feasible algorithm. A 
similar set is briefly sketched here. 


Parallelism 

The need to process large amounts of input data in short amounts of time implies the use of 
computations that can be implemented in a parallel manner, using a large number of intercon¬ 
nected processors. 

Local Support 

If the number of processors involved in die computation is large, it becomes infeasible to con¬ 
nect each one to all of the others. Radier, there should only be local connections between the 


processors. Here, “local" means not only that die number of connections be small, but also that 
since die information being processed has a two-dimensional plane as an underlying coordinate 
system, die connections should also be local in a spatial sense. If die support of a function, 
defined on a two-dimensional grid, is die set of points on the grid diat contribute in a non¬ 
trivial manner to die computation of the function, dien our requirement is that the processors 
implementing our computation must have local support. 

Uniformity 

One final consideration, though not as critical as die first two, concerns the uniformity of the 
processors. If it is possible, an algorithm that utilizes parallel networks of identical processors 
will be favored over other algorithms. Such a requirement is not crucial, however. 


Although the original motivation for such restrictions on an algorithm arises from consideration 
of the human visual system and restrictions of biological feasibility, they could apply equally well to 
other types of image processing systems. As such, they are taken as general criteria for the computa¬ 
tions to be investigated, regardless of whether the algorithm serves as a model of the human system. 

\ 

In designing algorithms to solve a particular visual process, the first step is to seek a method that 
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solves the problem. Having done so, one can then consider its applicability in light of the criteria 
outlined above, and possible modifications to the algorithm in order to satisfy those criteria. 

5.2 Mathematical Programming 


The surface interpolation problem, as we have developed it, can be viewed as an optimization 
problem; that is, the solution to the surface interpolation problem is equivalent to the minimal point 
of an objective hypersurface. There is a large body of literature on methods for finding the solution 
to optimization problems in general. In considering which one to apply to our problem, we take 
two factors into account. The first is the form of the input data supplied by stereo (and possibly 
also stmcture-from-motion). The key point is that the set of known points will generally consist of 
a series of zero-crossing contours, along which the depth is known. These contours are not closed, 
since the horizontal components will have no disparity value, and hence no depth value, assigned to 
them. Further, they tend to be scattered at random rather than distributed uniformly over the grid. 
(This removes many methods from further consideration.) The second factor is the architecture of the 
possible algorithm, outlined by the algorithmic criteria of the previous section. As a consequence of 
these two factors, many of the possible methods, while perfectly valid solutions mathematically, ate 


not readily applicable to our problem. A comprehensive review of tire types of methods may be found 

in Schumaker [1976] (see also Grimson [1980,1981b]). 

Given that an algorithm used to solve the visual surface interpolation problem must be local, 

parallel and uniform, and must be capable of dealing with scattered input data, one of the best 
methods to use is mathematical programming, and in particular, nonlinear programming. Gilman 
[1979b] has shown that many problems of relaxation and constrained optimization can be solved by 
local nonlinear programming processes (see also Hummel and Zucker [1980]). Indeed, a method 
similar to that outlined here was used by Ullman in solving die motion correspondence problem 

[Ullman, 1979a]. 

Recall that the problem with which we arc faced is to find the surface that minimizes a func¬ 
tional measuring surface consistency. The most likely candidate for this functional is the quadratic 
variation. The boundary conditions with which the surface must agree are depth values along 
the zero-crossings, given either by the Mnn-Poggio stereo algorithm or the Ullman structure-lrom- 
,notion algorithm. These boundary conditions can be met in one of two ways. If die surface is 
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required to fit exactly through the boundary points, the problem is one of surface interpolation. If 
the surface is required only to pass near the known points, while minimizing some error function, the 
problem is one of surface approximation. In the following sections, both problems will be examined. 

Two cases of optimization will be examined: unconstrained optimization, which is applicable 
to the approximation problem, and constrained optimization, which is applicable to the interpolation 
problem. Standard algorithms for computing the solution to the optimization problem for each case 


are sketched below. For more details on mathematical programming, see for example Luenbcrger, 
[1973]. 

The Conjugate Gradient Algorithm 

Starting at any point xo G E n define do = 

x fc+i - 

(Xk - 
dfc+i - 


Pk 


-go = b — Qxo and 
x fc 4* Qfcdfc 

gfc4 

d[Qd fc 

—gfc +1 4" Pk&k 

Sfc+iQdfc 


where g fc = Qx fc — b. g 


We will apply this algorithm, for the case of unconstrained optimization, to the problem of 


visual surface interpolation in the next section. Because the algorithm is solving an unconstrained 
optimization problem, it will be applicable to die surface approximation problem, where the surface js 


required to pass near, but not necessarily through, the known points. 
Gradient Projection Algorithm 

The algorithm may be summarized as follows. 


1 . 

2 . 


Find the subspace of active constraints M, and form the matrix A 

i 


Calculate the projection matrix IT 


1 


aJ(a„aJ) 


l P 


P* 

and the direction vector 


-PfcV /(x) r . 

3 . If d 7 ^ 0 , find the scalar that maximizes 


W 


{a: x ~f~ ad is feasible} 


and the scalar c 2 that minimizes 

{/(x ~f- ad): 0 < a < c\} 







as a function of a. Set x to x -f- C 2 d and return to (1). 

4. If d = 0, find 0 = -(a p a£) A p V/(x) r . If ft > 0, for all j corresponding to active 
inequalities, stop, as x satisfies the Kuhn-Tucker conditions. Otherwise, delete the row from A p 
corresponding to the inequality with the most negative component of/3 and return to (2). i 

We will apply this algorithm, for the case of constrained optimization, to the problem of visual 
surface interpolation in the next section. Because the algorithm is solving a constrained optimization 
problem, it will be applicable to the surface interpolation problem, where the surface is required to 
pass through the known points. 


6. The Interpolation Algorithm 


The algorithms of the previous section can now be applied to the problem at hand, the inter¬ 
polation (or approximation) of visual surfaces from the stereo data. Recall that the interpolation 


problem was stated as: 

The Interpolation of Visual Information: Suppose one is given a representation consisting of 
surface information at the zero-crossings of a Primal Sketch description of a scene (this could be either 
from the Marr-Poggio stereo algorithm, or from the Ullman structure-from-motion algorithm, or both). 
Within the context of the visual information available, the best approximation to the original surface in 
the scene is given by the minimal solution to the quadratic variation in gradient 
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(where s denotes a surface). Such approximations are guaranteed to be uniquely “best” to within an 
element of the null space of the functional 0. In the case of quadratic variation, the null space is the 
set of all linear functions. Provided the set of known points supplied by the stereo algorithm or by 
the structure-from-motion algorithm includes at least three non-colinear points, the component of the 
surface due to the null space is uniquely determined to be the null surface. Hence, the surface most 


consistent with the visual information is uniquely determined. 

We shall consider solving this optimization problem both in the case of interpolation (the sur¬ 
face passes exactly through the data) and in the case of approximation (the surface passes near the 

data). Although the algorithms could be either applied to the square I aplacian or to the quadratic 

* 

variation, we shall examine only the case of the quadiatic variation. 










CONVERSION TO THE IMAGE DOMAIN 


6.1 Conversion to the Image Dolftaln 

The problem, as stated, lies clearly within the domain of continuous functions. Yet this is 
not appropriate to the case at hand. In order to establish an algorithm for transforming the visual 
information into a representation of the surface shape, a number of conversions must take place. 

The first point to note is that the functional ©(s) consists of a square root. (Note that we will use 
s to denote die surface which we are fitting to the known points, to distinguish it from the notation 
of s used in the previous chapter to denote the objective function.) However, clearly any function 
which minimizes the functional Q(s) also minimizes the functional 0 2 (s), and vice versa, provided 
that the functional is always positive in value. Hence, without loss of generality, one may consider the 
minimization of 

©(«) = / /(*** + 2s *v + s lv) dxdy - 

Throughout this section, this will be die functional to be minimized. 

In order to determine the structure of die algorithm, one must address the issue of die form of 
the output representation, since that will have a major effect on die actual algorithm. In this case, it 
is desired that the surface information be specified only at particular places within the representation 
of the scene. This will be accomplished by requiring diat die interpolation algorithm compute explicit 
depdi values at all locations within a Cartesian grid of uniform spacing. Although both the spatial 
resolution of die grid and the resolution of the depth information stored within that grid should be 
determined, it is considered that such parameters are not critical to die development of die algorithm. 
Hence, these parameters will be assigned arbitrary values. 

The continuous functional must now be converted to a form applicable to a discrete grid. 
Without loss of generality, assume that die grid is of size m X m. Each poin t on the grid may be repre¬ 
sented by its coordinate location, so diat the point (i, j ) corresponds to die grid point lying on the i th 
row and the j th column. At each point (i,j) on die grid, a surface value may be represented by 
Each such surface value may be considered as an independent variable, subject to the constraints 
of the problem, of course. Using eidier row major order or column major order, these variables 
S(ij) may be considered as a vector of variables, denoted s — {«(o,o)i s (o,t)> • ■ • >%»*—i),(in-i))}* 
(For clarity, a straightforward transformation from the doubly-indexed grid coordinates into a singly- 
indexed vector coordinate can be established, f or example, the grid point (i, j) can be mapped 
to the m lor point k =- mi -f j, and the vector point k can be mapped lo the grid point (t, j) --- 






([k/m\,k — m[k/m\).) It is this vector which will be modified using the non-linear programming 
algorithms, and the final value of which will form the solution to the optimization problem and thus 
correspond to the desired interpolated surface. 

Having converted the surface function to a discrete grid format, it is now necessary to convert 
the objective function of the optimization problem to a discrete format. This means that the 
differential operators must be converted to difference operators. There are many possible dis¬ 
crete approximations to the differential operators. We choose to use the following approximations 

[Abramowitz and Stegun, 1965, p. 884]. 

The second partial derivative in the x direction may be approximated by 



— 2s( JJ ) + S(i— i,j)] + 0(h?) 


2 

where h is the grid spacing, and 0{h 2 ) indicates that the approximation is valid to terms of order h . 
Similarly, the second partial derivative in the y direction may be approximated by 


d \jJ) 
d y 2 



2s (« ,j) + fy,.?--!)] o(h 2 ). 


The cross second partial derivative can be approximated by 


d 2 s, 


(i,i) 


dxdy 4h' 2 


-2[s (i+ jj + i) i) ®(*Ti >3 —i) "C hi i)] ”1” ^(^ )• 


Note that such approximations have frequently been used in the image processing literature, (for 
example, see die reviews of Davis [1975], Rosenfcld and Kak [1976], Pratt [1978]). Little is known of 

A 

die affect of diese approximations on the behavior of die result, 

Having converted the surface function and the differential operators, one must convert the 

double integral to a discrete equivalent. This can easily be done, by using a double summation 

over the finite difference operators applied to the discrete grid. One minor point is noted. While 

it is straightforward to form the discrete equivalent to the double integrals J f s 2 xr dxdy and 

f fa 2 dxdy, the cross term 2 / / a* dxdy is handled differently. In particular, consider a second 

grid, superimposed on the first, which bus twice the spatial resolution of the titst (that is, nil integer 

points are represented as are all points (^, ^)). for the cross term, we shall apply the finite dilTeicnce 

* 

operator to ail half integral points on this finer grid. The combination of these operators yields the 
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discrete objective function: 


minimize 


m —2 m —1 / 

EE 

i={ j~0 ^ 
m —1 m —2 / 

E E( 

t=0 7=1 V 


s (i-i,j) ~ 2+ S(i+ij) 




m—2 m—2 ✓ \ 2 

+ 2 X] £ ( ~ ^H-U) — s (Li+t) + S (i+I,i+1)) • 

i—0 j= 0 V ' 

Finally, the characterization of the constraints must be considered. The case of interpolation 
will be considered first, where the interpolated surface is required to pass through the known points. 
Let if = {(LJ) I there is a known depth value at the grid point ( i,j )} be the set of grid points 
for which a depth value is known. Then the constraints on the optimization problem have the form 
~ c {i,j ) — 0 for all points (i, j) in the set If, and where the C(ij)S are a set of constants reflecting 
the stereo data. Note that the set of constraints are all equality constraints. 

6.2 The Gradient Projection Interpolation Algorithm 

It is now possible to consider applying the gradient projection method to this problem: 


minimize 


ra —2 m —1 / 

£ £ 

i= l ?=o v 


S (i—1,3) — 2 + fy+U) 


m — 1 m —2 ✓ 

2Lj l tS ( i > 

4=0 7=1 V 


j— 1 ) ^ S (hJ ) J-f- 1 ) 


■in —2 m —2 ✓ 

+2 £ £ (^f,3)~^t+l,j)~^«,3+l) + ^i4-X,J+l)) • 
t=o y=o ^ ' 

subject to s (u) — c (i ,j) = 0, V(t, j)'£ if. 

\ 

\ 

To apply the method of gradient projection, it is necessary to determine the set of active con¬ 
straints, and the projection matrix onto the subspace spanned by the active constraints. Clearly, since 
all the cuii:.liaims are equality constraints, they are all active at every iteration. Thus, the matiix ,V 




(where p = |if|) has rows consisting of a 1 in the position corresponding to die grid point (i, j) for 
(i,j) 6 if and 0’s elsewhere. One can easily show that A P A£ = I and that A^A p = Sj where Sj is 
a matrix consisting on 0 ’s except for those rows corresponding to a point in if, such rows containing a 
1 for die diagonal element. Thus, the projection matrix P = I 5$ consists of all 0 s except for the 
diagonal elements in those rows corresponding to a grid point not in if, such elements being 1. The 
effect of the projection matrix P is to ignore any components of the direction vector d corresponding 

to a known point, while preserving all other components, unaltered. 

Recall that the direction vector d is determined by the projection of the negative gradient of 
die objective function. By expanding die double summadon and performing the differentiation, the 
components of die gradient of the objective function are given, in this case, by the following: 


For all elements in the center of the grid, apply die following 
surface function s: 

r 2 


stencil to the grid representation of the 


4 -16 4 

2 _16 40 —16 2 • 

4 —16 4 

2 

By this, we mean diat given a two-dimensional grid representation of die current surface approxima¬ 
tion, s, the value of die component of die gradient of die objective surface at some point ( i, j) on the 
grid is obtained by applying the above stencil centered over diat point (i, j), multiplying the value of 
each of the stencil points with the value of the surface at that point and summing diese products. The 


value of the components of the gradient can be computed in this manner by applying die stencil to all 


points in the center of the grid. 

Along die outer edges of the grid, the above stencil does not apply. Instead, a careful expansion of the 
gradient of the objective function shows that the following stencils should be used. 

For elements in the corners of the grid, apply die following stencil (or its appropriate rotations and 
reflections) to the grid representation of the surface function s: 
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For elements along an outside row of the grid, one point removed from the corner, apply the fol¬ 
lowing stencil (or its appropriate rotations and reflections) to the grid representation of the surface 
function s: 

2 

4 -12 4 

—8 20 —12 2 

For elements along an outside row of the grid, more than one point removed from §ny corner, apply 
the following stencil (or its appropriate rotations and reflections) to the grid representation of thg 
surface function s: 

mm «| 

2 

4 -12 4 

2 —12 22 —12 2 _ 

For elements along a row second from the outside of the grid, located ope element from each of 
two outside rows, apply the following stencil (or its appropriate rotations and reflections) to the grid 
representation of die surface function s: 

m » m 

2 

4 -16 4 

— 12 36 —16 2 ' 

4 -12 4 

For all other elements along a row second from the outside of the grid, apply die following stencil (or 

its appropriate rotations and reflections) to the grid representation of die surface function s: 

2 

4 -16 4 

2 —16 38 —16 2 

4 -12 4 


Thus, the direction vector has zero valued components at all points corresponding to known 
depth values, and non-zero valued components elsewhere, with value given by the result of convol v¬ 
ing the above stencils with the current surface approximation. It is interesting to note that the stencil 
used in the interior of the grid is a finite difference approximation to the Inharmonic equation V's 
0 |Al>i,iiii,’..a. and Stegun, 1965. p.885]. This should not be surprising, since the I'.uler equation. 
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derived previously from the calculus of variations, was precisely this equation. Thus, we see that the 
quadratic programming algorithms implicitly solve the Euler equation. 

We have determined the form of the direction vector, which specifies the direction in which to 
move in order to reduce the objective function and refine the surface approximation. To determine 
tire amount to move in this direction, it is necessary to determine the minimum value of the objective 
function along this direction, that is, to determine the value of a such that 




m—2 m —1 s 

2 2 ( — 2 s (bj)+ s (i+b;) 

a* —i /% ..... -n V 


i=i j —o 


-f-ad(i—ij) — 2ad(;j) + Qd (i+t,j)J 

m —1 m —2 ✓ 

+ 2 2 2 s (i,j) + s (‘.i+t) 

i=o J=1 '• 

i) 2 ad(ij) + J 

m —2 m—2 ✓ 

+2 Xj Xj f 5 ^) fy'-Hii) 

„• n a —n \ 


i=0 i=0 


( 

—j—cic/(j ? j) f~i j) 


is minimized. A straightforward application of calculus determines that this value for a is given by the 


ratio of o — where 
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and 


m— 2 m— 1 ✓ \ 

M " ii . h im P—w— / \ 

Q i = 2 j X l ^-LJ) ~ 2s (bi) + «(i+i,J) I 

z=l j=0 ' y 


( 


2d(i j) + d(i-f-ij) 


) 


m —1 ra —2 


+ 


ri—l 7 n—& s 

\—\ I 

X X 

i=0 7=1 v 


l) ~ 2 s (bi) +-^»,i+l) 


) 


1 ) ~~ 2 < %J) + d (iJ+l)j 

rn —2 m —2 / \ 

+ 2 X X (4* ,i) ~ s (H-l,j') — S (i,i+1) + s (t+l,j+l) ) 

o i=o ^ ' 

fd( it j) — d(i-f ij) — d(i,i+i) + rf (i+i,i+i)J 


m —2 m —1 ✓ \ 2 

02 ~ X X ” 2d (U) + d (i+t,f)J 


rn—1 rn—2 
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rn—l rn —z / 

V " 1 / ^ 

2-1 ( d ("j 

<l~0 j—J ^ 


l)— 2d {i,j) 4" ^(Li+i) 


f 


m—2 m—2 ✓ \ 2 

+ 2 X X ( d (Lj) “ d (H Li) — d (i,J+l) 4- «fa+i,j+l)J • 

i=0 j=0 ' ' 

T hus, die algoritlim is completely detennined. T he steps consist of: 

0. Determine a feasible initial surface approximation (any surface approximation which contains the 
known stereo depth values C(,-, j) at the known points (*', j) E T will suffice). 

1. Compute the gradient of the objective function by convolving the grid representation of the cur¬ 
rent surface approximation with the stencils listed above. Compute the direction vector by taking the 
negative of the gradient, setting any components corresponding to known depth points to zero. 

2. Compute the scalar a which specifies the amount to move along the direction vector on the 
hypcrsurfacc defined by the objective function, by the formula given above. 

3. Refine the surface approximation by incrementing the current surface approximation with the 
scaled dim lion vector. 
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4. Return to step (1) and continue until the magnitudes of all components of the direction vector are 


smaller than some constant e. 


6.3 Examples of Interpolation 


We can demonstrate the effectiveness of the surface interpolation algorithm by considering the 
performance of tire gradient projection algorithm on several examples. Although the previous discus¬ 
sion dealt specifically with applying the gradient projection algorithm to the quadratic variation, a 
similar analysis can be performed for other functionals such as tire square Laplacian. (Recall that any 
feasible functional was a linear combination of these two functionals.) 

To demonstrate both the effectiveness of tire interpolation algorithm and the difference between 
tire quadratic variation and the square Laplacian, we consider three synthetic examples in Figures 4- 
6 . In Figure 4, the interpolation algorithm is given as boundary conditions a set of closed contours 
from a cylinder, oriented parallel to the x-axis. It can be seen that the surfaces obtained by minimizing 
the square Laplacian and the quadratic variation differ markedly along tire edge of tire region. This 
is to be expected for two reasons. In Section 5, we derived the Euler equations for the interpolation 
problem, a set of differential equations which must be satisfied by the minimal surface. The Euler 
equations for tire interior of a region were identical for both tire square Laplacian and the quadratic 
variation, namely the biharmonic equation. Along the edges of the region, however, the natural 
boundary conditions imposed different equations on the solution surface. This fact is rcilcctcd in 
Figure 4. The second reason for tire different surfaces arises from the form of the stencils obtained 
in Section 6.2. The stencils to be applied at tire edges of a region in the case of quadratic variation 
arc numerically more stable than those to be applied in the case of the square Laplacian. (This may, 
in fact, simply be a reflection of the difference in Euler equations.) In cither case, it can be seen 
from Figure 4 that while minimizing tire quadratic variation results in a reasonable approximation to a 
cylinder, minimizing the square Laplacian yields less acceptable results. 

In Figure 5, we illustrate a second synthetic example. In this case, tire boundary conditions are 
points lying on a hyperbolic paraboloid, chooscn at random so that the known points do not form 
closed contours. As in the previous case, it can be seen that while the surfaces obtained by minimizing 
the two functionals arc very' similar in the interior of the region, the surfaces differ drastically along 
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Figure 4. Synthetic Example. The top figure shows a synthetic set of boundary conditions, 
consistent with a cylinder aligned with die axes of the grid. The middle figure shows die surface 
obtained by applying the gradient projection algorithm to the square Laplacian functional. The 
bottom figure shows the surface obtained by applying the algorithm to The quadratic variation. 
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Figure 5. Synthetic Example. The top figure shows a synthetic set of boundary conditions, consistent 
vhh a hyperbolic paraboloid. Tlte points are chosen at random with a density of 10 percent, 
[he middle Retire shows the surface obtained by applying the gradient projection algorithm 1.0 
he square Laplacian functional. The bottom figure shows the surface obtained by applying the 
ileorithm to the quadratic variation. „ 







































the edges of the region. Again, minimizing the quadratic variation yields a reasonable approximation 
to a hyperbolic paraboloid. 

In Figure 6, we illustrate a third synthetic example. In this case, the boundary conditions are 
again taken from a cylinder, here oriented at 45 degrees to the x-axis. As in the previous cylinder 
example, the major difference between the two surfaces occurs along the borders of the region and the 
minimization of quadratic variation yields a good approximation to the cylinder. 

The interpolation algorithm was developed to account for the creation of complete surface 
descriptions from the sparse surface information provided by visual modules such as stereo. We can 
demonstrate the effectiveness of the interpolation theory by applying the algorithm to different stereo 
examples. In Figures 7-10, we illustrate the results of applying the surface interpolation algorithm 
to the output of the Grimson implementation [Grimson, 1981a, 1981b] of the Marr-Poggio stereo 
theory [Marr and Poggio, 1979] applied to a pair of stereo images. It should be noted that in these ex¬ 
amples the interpolation algorithm was applied directly to the disparity values obtained by the stereo 
algorithm, without converting them to depth information. As a consequence, the displayed surfaces 
in the figures will not exactly reflect the shape of the surface, since an additional nonlinear transfor¬ 
mation from disparity to depth is still required. For the purposes of illustrating the interpolation 
algorithm, however, the use of interpolated disparity values suffices, since the interpolation algorithm 
will preserve the general shape of the surfaces (that is, the sign of the surface curvature) as well as the 

relative differences in depth between different surfaces. 

Figure 7 shows four stereo pairs of images, on which the algorithm was tested. Figure 8 shows 
the surface obtained for a wedding cake random dot stereogram. The four planar surfaces arc clearly 
visible, although the effect of a small number of incorrect disparity values at the junctions of adjacent 


planes can be seen. Figure 9 shows the surface obtained for a spiral staircase random dot stereogram. 
Again, while the general shape of the spiral staircase is clearly apparent, the effect of a small number 
of incorrect disparity values can be seen. Figure 10 shows the surface obtained for the natural image 
of a coffee jar. As in the previous cases, the general shape of the surfaces arc clearly evident. Not 


only is the jar sharply separated in disparity from the background plane (which is slightly slanted), but 
the overall shape of the jar can be distinguished. Figure 11 shows the surface obtained for the natural 
image of the Moore sculpture. As in the case of Figure 10, the general shape of the surface can be 










I i. r nr "h l-xomples of Mcreo (mages. The figures, from top to bottom, show the nuidorn dot 
•*t ! »v* y. am o! a wedding cuke, the random dot stereogram of a spiral vein ease, a naiuntf image 
of a coilee bottle, and a natural image of a sculpture by Henry Moore. 






















Figure 8. The Wedding Cake. The figure shows the surface obtained b 
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distinguished. Note that because no disparity values can be obtained for the hole in the center of the 
sculpture, the interpolation algorithm interpolates across the hole as if it were a uniform surface. 


6.4 The Conjugate Gradient Approximation Algorithm 

Previous sections have addressed the case of interpolating a surface through the known stereo 
depth values. In this section, the case of approximating a surface relative to the known stereo depth 
values is considered. There are several reasons for considering an approximation of the known depth 
values, rather than an exact fit through them. The first reason is that the accuracy of the stereo data 
may not be sufficient for the purpose of surface approximation. In particular, the algorithm outlined 
for performing the stereo computation yields disparity matches with an accuracy of one picture ele¬ 
ment. One must consider if such accuracy is sufficient. As well, one must consider the accuracy 
with which the zero-crossing positions reflect the location of a point of interest on an object in the 
scene. Since die operators which extract the zero-crossings have a non-infinitesimal spatial extent, it 
is possible that the zero-crossing positions undergo slight fluctuations in position, such fluctuations 


causing a small error in the dispaiity matches assigned by the algoiithrn. The second reason is that 
the stereo algorithm does occasionally make an incorrect match If an exact surface interpolation is 
reejuired, such points will incorrectly cause a change in the shape oi the surface, and the effect of 


such points can spread over a noticeable region of the surface reconstruction. By requiring a surface 


approximation, the effect of such “bad” disparity points can be minimized. 

The basic notion is to combine a measure of “nearness of fit to the known points” with 
a measure of the consistency of the surface with the zero-crossing information. This can be ac¬ 
complished by considering a penalty method unconstrained optimization problem. Here, the objec¬ 
tive function to minimize is . 

0(s) = J f ^ + s 2 yy + 2 dxdy + /? (*(*, y) — c{x, y)f 


where the summation takes place over the set If of all points in the repiesentation foi which tlieie is a 
known stereo depth value c[x } y). I he effect of this objective function is to minimize a least sQiiares 
fit through the known points, scaled relative to the original minimization problem. I he constant/? 
is a scale parameter to be determined by the degree of desired fit. Note that the constiaints have, in 
this case, been incorporated directly into the objective function. Hence, the objective function may be 
optimized as if it were an unconstrained function. 
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The translation of this problem into the image domain yields the following discrete version of 
the objective function: 


minimize 


m —2 m —1 > \ ; 

X] X] ( — 2s (bi) + s (*+i) 

•=i i—o X ' 


m —1 m —2 s 

+ Xj X] ( ~ 2 s (bi) + S (hi+I)j 

1=0 j=l X ' 


m~—2m—2 ✓ \ - 

+ 2 X] Xj ( “ fy'+U) ~ s (iJ+i) + s (i+l,j'+t)) 

i=0 j =0 X '' 

+/3 Xj (^bi) - c (»,i)) • 


It is now possible to consider applying the conjugate gradient method to tills problem. Recall 
that this method, when applied to the quadratic case, is considered the minimization of 

^s r Qs — b T s. 


In this case, the vector b is given by 


•2/?c 


where c is a vector whose components are the known depth values c^ >3) if the corresponding grid 
point has such a known value, and 0 otherwise. The matrix Q is given by the discrete stencils outlined 
in the previous section, with an added diagonal factor of S. f . One can then straightforwardly apply the 
conjugate gradient algorithm, with these forms for b and Q. 

6.5 Examples of Approximation 

The conjugate gradient algorithm, applied to the surface approximation problem, is 
demonstrated by considering a series of examples, illustrated in Figures 12-16. These should be com* 
pared to Figures 8-11. As in the case of surface interpolation, the surface approximation algorithm 
has been applied to disparity values rather than depth information. Again, the general shape of the 
surface and the relative difference in positions of the surfaces have been preserved by the algorithm, 


although the exact surface shape has not been reconstructed. 
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The objective function of the conjugate gradient algorithm in this case contains two factors, 
the quadratic variation of the surface and a least-squares term embodying a type of “smoothness” 
requirement. The scalar constant (3 determines the relative strengths of these two factors. If we let 
j3 be very small, then the smoothness requirement essentially vanishes and we return to the case of 
surface interpolation, discussed previously. The conjugate gradient algorithm then becomes identical 
to the gradient projection algorithm. If we let j3 become very large, then the quadrauc variation factor 
essentially vanishes and the algorithm reduces to a least-squares fitting of a plane to the known points. 
Clearly, we require a value of (3 intermediate to these extreme cases. The figures illustrate this tradeoff 
between the two factors, as (3 varies. To determine the optimal value for / 3 , we require an estimate for 
the density of incorrect disparity values obtained by the stereo algorithm, so that a value for (3 may 
be chosen which smooths out the effect of these incorrect values, while not affecting the shape of the 
surface determined by minimizing the quadratic variation. 



7. Analysis and Refinements 


7.1 Discontinuities 


One of the implicit assumptions of the interpolation algorithm is that the pieces of surface are 
in fact pieces of a single surface. Of course, this will frequently not be the case. In this section, we 
consider what modifications arc necessary in order to account for the existence of several surfaces 
within a scene. In particular, we address the issue of explicitly computing discontinuities in the surface 
representation, and the effects of explicit discontinuities on the form of the reconstructed surface. 

One of the problems associated with the failure to make surface discontinuities explicit is that 
information about the shape of one surface affects the shape of an adjacent surface. This is illustrated 
in Figure 17. A set of known depth points is given in Figure 17(a). Intuitively, the most likely 
surface to fit through these points would be a pair of planes with a discontinuity in depth between 
them, shown in Figure \l(b). However, tiie requirement that a smooth surface fit through these points 
results in a warping and rippling of the surface that is undesirable, as shown in Figure 17(c). 1 hus, 
the lack of explicit discontinuities can affect the shapes of the interpolated surfaces in an unacceptable 


manner. 




Figure 12. The Wedding Cuke. The figures show surfaces obtained by approximating the surface 
ising quadratic variation. The scalar constant relating the least squares term to the quadratic term 
5 j3 = 1 in the top figure and ft = 0.1 in the bottom figure. 

























































































































































































































figure 15. i he Codec Jar. The figures show two views of a surface obtained by approximating 
the surface using quadratic \aiiaUon. the scalar constant relating the least squares term to the 
quadratic term is (3 ~~ 0.01. 
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Figure 17. Discontinuities in the Surfaces. Figure (£ 2 ) shows a set of known data points. Intuitively, 
the correct reconstructed surface would be a pair of planes, with a discontinuity between them, as 
shown in figure (b). If the interpolation algorithm attempts to reconstruct a surface thiough the 
houndaiy points, without a discontinuity, the result is as shown in ftgmc (r). I he shatp change 
in depth results in a rippling of the surface. 
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In order to make discontinuities explicit, there are several questions to ask about the process. 
How are the discontinuities detected? Where are they placed in the representation? When does the 
detection of discontinuities take place in the overall interpolation process? In the next few sections, 
we will discuss two possible methods for detecting the discontinuities, and their role in the overall 
interpolation. 

7.1.1 Occlusions in the Stereo Algorithm 


Consider the geometry indicated in Figure 18. There are regions of the left image which will not 
have a corresponding region in the right image, and vice versa. Consequently, any zero-crossings in 
this portion of one image will have no counterpart in the other image, and the stereo algorithm should 
not assign any match to such zero-crossings. Hence, one possible mechanism for detecting occlusions 
would be to search for portions of the image which contain unmatched zero-crossings. Then, the 
interpolation can be restricted to take place only over those sections of the image which are bounded 
by zero-crossings with known disparity values. 

This method would detect die discontinuities before the interpolation, since it uses stereo in¬ 
formation directly to locate the occlusions. A problem with tire method is that it will not detect 
all discontinuities, only those in the horizontal direction. Discontinuities that occur in the vertical 
direction do not cause occlusions. Hence, any method for detecting discontinuities which relies only 
on the unmatched zero-crossings will be incomplete. 

7.1.2 The Primal Sketch Revisited 

An integral part of most computational theories, proposed as models of aspects of the human 

* 

visual system, is the use of computational constraints based on assumptions about die physical world 
[Marr, 1976, 1980; Marr and Poggio, 1979; Marr and Hildreth, 1980; Ullman, 1979], In some of the 
computational theories, the constraints arc explicitly checked for validity within the algorithm (e.g. 
Ullman’s rigidity constraint in recovering staicture from motion). In others, the constraints arc simply 
assumed to be true, and are not explicitly checked (e.g. Marr and Poggio’s uniqueness constraint in 
stereopsis). Can any aspect of the surface consistency constraint be explicitly checked and used by the 
algorithm? 

The basic notion of the surface consistency constraint is that the surface cannot undergo a radi¬ 
cal change in shape without having an accompanying zero-crossing in the convolved image. Implicit 
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tiie region of the surface visible to the left eye, but not to the right. 


in this constraint is the assumption that the portion of the image being examined in fact corresponds 
to a single object. Thus, one could propose that if the shape of the interpolated surface forces a zero¬ 
crossing in a location for which none exists in the Primal Sketch, then such a zero-crossing indicates 
a location at which the assumption of a single object is violated. Such zero-crossings could then be 
taken as indicative of a surface discontinuity. 

Perhaps the simplest method of detecting such discontintuities is again to use ideas inherent in 
the Primal Sketch. Recall that the Primal Sketch created descriptions of points in the image associated 
with inflections in intensity, for a range of resolutions. Since the image intensities may be considered 
as a type of three-dimensional surface, tire Primal Sketch operators essentially detect discontinuities 
in the image intensities for a range of resolutions. Thus, one could apply the same type of analysis 
to tire detection of surface discontinuities, where now the surface on which the operators apply is the 
reconstructed depth surface, rather than the intensity surface. 
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It is worth noting that not only should the operators be of the form used in the extraction 
of the Primal Sketch, but that it may also be useful to use a range of operators, as in the Primal 
Sketch. One reason for using multiple zero-crossing detectors was that surface changes, and hence 
intensity changes, could take place over a wide range of scales. This is still true in the case of 
surface descriptions, such as have been constructed for the coffee jar or the wedding cake. Thus, 
surface discontinuities corresponding to occluding edges will frequently tend to correspond to large 
surface changes, while internal surface discontinuities, due to a warping of the surface, will tend 
to correspond to small surface changes. By using a range of V 2 G operators, one can extract both 
occluding contour discontinuities, as well as ripples or warpings of the surface itself. 

Note that this method requires diat the surface interpolation already take place, before it can be 
applied. Since one of the general requirements on an algorithm is that it be rapid, we must consider 
the consequence of detecting discontinuities after the interpolation of the surfaces. There are two 
main reasons for the explicit detection of discontinuities. One is that such an explicit representation 
of this information will allow higher level processes, such as recognition, or extraction of axes for 
three-dimensional shape analysis, to operate more easily, since the process serves to make implicit 
information explicit. However, a second reason is to create more accurate surface representations, by 
removing the type of effect illustrated in Figure 17(c). If the process used to isolate discontinuities 
takes place after interpolation, and if tlie interpolation process requires the discontinuities to improve 
the interpolated surface approximation, one must propose an interpolater which passes over the sur¬ 
face information twice; first to produce an initial description, and second to refine die description 
after the detection of discontinuities. One must then question whether such a two pass process will 
affect our constraint of rapid algorithms. Fortunately, the answer is no, since the surface approxima¬ 
tion obtained without explicitly accounting for the discontinuities is very close to the limiting surface 
except in the areas of the discontinuities (that is, any effects of die discontinuities are quickly damped 
out as one moves across the surface). Thus, the initial starting position for die second pass of the 
interpolation algorithm is very close to the limiting surface, and only a few iterations will be needed to 
refine the surface approximation. 




7.1.3 Interpolation Over Occluded Regions 

i 

Even though occluded regions of the image can only be viewed from one eye, the human system 
still associates a depth value with these regions. This has an interesting implication for the interpola¬ 
tion algorithm. For most occluded regions, the only depth information available is at the edges of the 
occluded region. Psychophysical experiments have shown that the occluded region is always perceived 
at the depth of the lower surface. Thus, in Figure 18, the occluded region would be perceived at the 
level of the lower surface. Note that this is consistent with the physics of the situation, since if the 
occluded region were perceived at the level of the upper surface, then it should in fact be visible to the 

right eye, and this is not the case. 

This observation suggests that when an occlusion is detected, it is explicitly located along the 
occluding boundary corresponding to the edge of the nearer object. This allows the occluded region 
itself to be associated with the lower surface, and the interpolation algorithm will fill in surface values 
for the occluded region from this lower surface. 

This raises an interesting psychophysical prediction. The psychophysical literature has examined 
the case of planar surfaces and their occlusions, as in Figure 18. If the interpolation method developed 
here is given an explicit discontinuity along one edge of the occluded region, it will correctly fill in 
the region as an extension of the lower plane. Of interest is the case in which the occluded region is 
not planar. For example, consider a cylindrical object. If the interpolation algorithm is given this type 
of input, it will fill in the occluded portions of the surfaces as a smooth continuation of the curved 
cylinder. If the interpolation algorithm correctly models interpolation by die human visual system, 
then this predicts that die surface perception for human observers in diis situation should also be that 
of a smooth cylinder. While informal experiments indicate that this is true, the prediction has not yet 
been rigourously tested psychophysically. 

7.2 Noise Removal 


Although in general the Marr-Poggio stereo algorithm is very good at matching zero-crossings 


correctly (especially for random dot patterns), incorrect disparity values may sometimes be assigned to 
regions of the image. These incorrect values can be considered as noise superimposed on the correct 


surface. Since the surface interpolator explicitly attempts to fit a surface through all the disparity 
points, such noise points can affect the shape of the surface approximation. Indeed, the effect of these 
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noise points can spread over a noticeable portion of the surface, before the nearby disparity values can 
damp out its effect. Thus, it would be preferable to remove these noise points, or at least neutralize 
their effect on the approximated surface shape. One possibility is that if a two pass interpolator is 
used, as suggested in the previous section, the detection of surface discontinuities will isolate such 
noise points from the rest of the surface, and the second pass of the interpolator will adjust the surface 
approximation to remove the influence of the noise points on the first pass approximation. Certainly 
this will be true for noise points with disparity values far removed from the correct values. For 
noise points whose disparity values are only slightly different from the correct surface disparities, the 
difference does not really matter. However, the final result would be that the noise points, while being 
isolated from the rest of the correct surface, would still remain in the final surface description. It 
would be preferable to completely remove such points. 

Is it possible to identify and remove noise points from the disparity map? If the noise points are 
isolated spatially, then it is possible to identify diem as undesirable. This follows from die form of the 
primal sketch operators. The case to consider is that in which one must distinguish between a set of 
noise points in a disparity map and a small object separated in depth from the rest of die scene. For 
the small object, the size of the zero-crossing contour is limited by the size of the available operator, 
and hence there is a minimum size of zero-crossing contour which die operator will yield about the 
object. If the number of zero-crossing points which differ significantly from their neighbors is less 
than this minimum, one may conclude diat the points are noise, and thus remove them. This will 
result in an improved surface approximation. 


7.3 Acuity 


It can be seen from the example of the interpolated coffee jar in Figure 10, tiiat die interpolated 
surface contains a bumpy quality which clearly is not consistent with die original object. How can 
this be explained? The effect occurs in part because the disparity values are specified only to within a 
pixel. This yields a fairly coarse disparity map which results in the bumps observed in the interpolated 
coffee jar of Figures 14 and 15. Hence, one method of removing the bumps would be to improve 
the accuracy of the disparities obtained by the algorithm. Note that some improvement in disparity 
accuracy is necessary if the algorithm is to be consistent with the human system. If we roughly equate 
pixels with receptors, then a pixel corresponds to roughly 27 seconds of arc. The implementation of 






the stereo algorithm computed disparity to within a pixel, while humans are capable of stereo acuity 
to a resolution of 2 — 10 seconds [Howard, 1919; Woodburne, 1934; Berry, 1948; Tyler, 1977]. 

In order to account for finer disparity values, it is necessary to localize the zero-crossing to a bet¬ 
ter accuracy than has been done so far. Since the convolution values are only specified at each pixel, 
one method for more accurately specifying the zero-crossing positions is to interpolate between the 
known convolution values [Crick, Marr and Poggio 1980, Marr, Poggio and Hildreth 1979, Hildreth 
1980]. Perhaps the simplest method is to rely on the observation of Hildreth that for most cases, 
even a simple linear interpolation will give extremely accurate localization of the zero-crossings. The 
addition of finer resolution depth information may improve the performance of the algorithm. 

This example also raises a question of scale. Depending on the application of the surface 
specification, different amounts of resolution may be required. For example, if the ultimate goal of 
the surface specification is to obtain a rough idea of the position and shape of the surfaces in a scene, 
the spatial resolution at which surface information must be made explicit may not be critical. In this 
case, the known data from the stereo algorithm may be sampled at a coarser resolution, before the 
interpolation takes place. This should result in a smoother surface approximation. Further, although 
the reconstructed surface is less exact in terms of fine variation of the surface shape, the overall shape 
of tire bottle is still preserved in this interpolation. 


7.4 Psychophysics 


We close by listing a series of psychophysical questions of relevance to the interpolation process. 
(1) What is the form of the surface perceived in occluded regions? In particular, the minimization of 
quadratic variation suggests that if a portion of a curved object is occluded, then the surface in 
the occluded region should also be curved, and should minimize the quadratic variation across 


that region. 

(2) Figure 17 suggests that if discontinuities are not explicitly demarked in tire interpolation 
process, a warping of the reconstructed surface (similar to Gibb’s phenomena) will result. 
While, in principle, such ripples in the surface arc' undesirable, it is worth asking whether the 
human system specifically accounts for discontinuities before interpolation occurs. This may be 
rephrased by asking whether in stereoscopic situations similar to Figure 17, we perceive a Mach 
band-like warping of the surface in depth? 
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(3) Wc have suggested that there are several possible functionals which could be used to determine 
the most consistent surface. Based on algorithmic and mathematical arguments,, we choose the 
quadratic variation. Can we test the shape of the reconstructed surface psychophysically? In 
particular, can we distinguish psychophysically between the minimum surface under quadratic 
variation and the minimum surface under some other functional, such as the square Laplacian? 

v 

Is the reconstructed surface psychophysically consistent with the surface computed by quadratic 
variation? 

(4) What is the spatial resolution of the reconstructed surface? That is, what is the spacing of the 
grid upon which the values of the reconstructed surface are computed? 

The answers to these questions will help verify or correct the theory of visual surface interpola¬ 
tion developed in this paper. 


8. Summary 


Computational theories of motion perception [UUman, 1979] and stereo vision [Marr and 
Poggio, 1979] can only specify the computation of three-dimensional surface information at special 
points in die image. In order to account for the visual perception of complete surfaces, wc have 
developed a computational theory of the interpolation of surfaces from visual information. 

The problem is constrained by the fact that the surface must agree with the information from 
stereo or motion correspondence, and not vary radically between these points. In Crimson [1981c], an 
explicit form of this surface consistency constraint is derived from die image intensity equation [Horn, 
1975], The main point of the surface consistency constraint is that it requires the interpolated surface 


to vary as little as possible. 

To determine which of two possible surfaces is more consistent with the surface consistency 
constraint, one must be able to compare the two surfaces. To do this, a functional from the space 
of functions to the real numbers is required, where the functional should measure some function of 
the variation in die surface. In this way, the surface most consistent with the visual information will 
be that which minimizes the functional. To ensure that the functional has a unique minimal surface, 
conditions on the form of the functional arc derived. In particular, if the functional is a complete 
semi-norm which satisfies the parallelogram law, or the space of functions is a semi-Hilbert space and 






the functional is a semi-inner product, then there is a unique (to within an element of the null space of 
the functional) surface which is most consistent with the visual information. 

It can be shown, based on the above conditions plus a condition of rotational symmetry, that 
there is a vector space of possible functionals which measure surface consistency, this vector space 
being spanned by die functional of quadratic variation and the functional of square Laplacian (Brady 
and Horn, 1981). Arguments based on the null spaces of the respective functionals were used to justify 
die choice of the quadratic variation as the optimal functional. 

Algorithms for computing the surface which minimizes quadratic variation in die case of exact 
surface interpolation and in the case of surface approximation were outlined and illustrated on a series 
of synthetic and actual surface interpolation examples. 
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